Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-19 Thread Mauro Carvalho Chehab
On Fri, 19 Jul 2013 01:27:18 +0200
Borislav Petkov wrote:

> On Thu, Jul 18, 2013 at 04:51:48PM +, Luck, Tony wrote:
> > +   BUG_ON(mci->mc_idx >= EDAC_MAX_MCS);
> > 
> > Do we have to "BUG_ON()" here?  Couldn't we be gentler with something like:
> > 
> > if (mci->mc_idx >= EDAC_MAX_MCS) {
> > printk_once(KERN_WARNING "Too many memory controllers\n");
> > return; /* probably need to make sure caller copes with this 
> > ... so more stuff there */
> 
> Yeah, we can do something like this:

With this change, the patch looks OK to me.

Acked-by: Mauro Carvalho Chehab 
> 
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 429e971e02d7..c55ad285c285 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -725,6 +725,11 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
>  	int ret = -EINVAL;
>  	edac_dbg(0, "\n");
>  
> +	if (mci->mc_idx >= EDAC_MAX_MCS) {
> +		pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
> +		return ret;
> +	}
> +
>  #ifdef CONFIG_EDAC_DEBUG
>  	if (edac_debug_level >= 3)
>  		edac_mc_dump_mci(mci);
> --
> 
> right near the beginning of the function so that we can save us the
> unwinding.
> 




Cheers,
Mauro
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-18 Thread Borislav Petkov
On Thu, Jul 18, 2013 at 04:51:48PM +, Luck, Tony wrote:
> + BUG_ON(mci->mc_idx >= EDAC_MAX_MCS);
> 
> Do we have to "BUG_ON()" here?  Couldn't we be gentler with something like:
> 
>   if (mci->mc_idx >= EDAC_MAX_MCS) {
>   printk_once(KERN_WARNING "Too many memory controllers\n");
>   return; /* probably need to make sure caller copes with this 
> ... so more stuff there */

Yeah, we can do something like this:

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 429e971e02d7..c55ad285c285 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -725,6 +725,11 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 	int ret = -EINVAL;
 	edac_dbg(0, "\n");
 
+	if (mci->mc_idx >= EDAC_MAX_MCS) {
+		pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
+		return ret;
+	}
+
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
 		edac_mc_dump_mci(mci);
--

right near the beginning of the function so that we can save us the
unwinding.
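
To illustrate why the placement matters, here is a minimal userspace sketch
(not the kernel code). When the range check comes before anything is
allocated or registered, the failure path is a plain return with nothing to
undo. The names sketch_mci, sketch_add_mc and SKETCH_MAX_MCS are hypothetical
stand-ins, and the limit of 16 is simply borrowed from this thread.

```c
#include <errno.h>
#include <stdlib.h>

#define SKETCH_MAX_MCS 16	/* hypothetical stand-in for EDAC_MAX_MCS */

struct sketch_mci {
	int mc_idx;
	char *scratch;	/* stands in for state acquired later in the function */
};

/*
 * Validate first, acquire afterwards: the early return needs no cleanup
 * because nothing has been allocated or registered yet.
 */
int sketch_add_mc(struct sketch_mci *mci)
{
	if (mci->mc_idx < 0 || mci->mc_idx >= SKETCH_MAX_MCS)
		return -EINVAL;		/* nothing to unwind */

	mci->scratch = malloc(64);
	if (!mci->scratch)
		return -ENOMEM;

	return 0;
}
```

Had the check been placed after the allocation, the error path would have to
free mci->scratch again before returning, which is the unwinding that the
early placement avoids.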

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


RE: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-18 Thread Luck, Tony
+   BUG_ON(mci->mc_idx >= EDAC_MAX_MCS);

Do we have to "BUG_ON()" here?  Couldn't we be gentler with something like:

if (mci->mc_idx >= EDAC_MAX_MCS) {
printk_once(KERN_WARNING "Too many memory controllers\n");
return; /* probably need to make sure caller copes with this 
... so more stuff there */
}

-Tony

Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-18 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 04:19:24PM +, Luck, Tony wrote:
> > What would be a reasonable maximum limit for the number of memory
> > controllers, on a -EX machine?
>
> Westmere-EX has one memory controller per socket ... and there are
> glueless systems up to 8 sockets. So 8 there. Not sure if any OEM is
> building larger machines with a node controller (SGI? Not sure if they
> build their behemoths from -EP or -EX parts).
>
> Ivy Bridge ups the ante with two memory controllers on a socket. So
> plan on doubling soon.

Let's give it a second try, 16 memory controllers max:

---
From 18fec2fd4279640b9f471c28aa3a5dc8be104273 Mon Sep 17 00:00:00 2001
From: Borislav Petkov 
Date: Fri, 12 Jul 2013 10:53:38 +0200
Subject: [PATCH] EDAC: Fix lockdep splat

Fix the following:

BUG: key 88043bdd0330 not in .data!
[ cut here ]
WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0()
DEBUG_LOCKS_WARN_ON(1)
Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw 
gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr 
iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor 
microcode
CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1
Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
 0009 880439a1d920 8160a9a9 880439a1d958
 8103d9e0 88043af4a510 81a16e11 
 88043bdd0330  880439a1d9b8 8103dacc
Call Trace:
  dump_stack
  warn_slowpath_common
  warn_slowpath_fmt
  lockdep_init_map
  ? trace_hardirqs_on_caller
  ? trace_hardirqs_on
  debug_mutex_init
  __mutex_init
  bus_register
  edac_create_sysfs_mci_device
  edac_mc_add_mc
  sbridge_probe
  pci_device_probe
  driver_probe_device
  __driver_attach
  ? driver_probe_device
  bus_for_each_dev
  driver_attach
  bus_add_driver
  driver_register
  __pci_register_driver
  ? 0xa0010fff
  sbridge_init
  ? 0xa0010fff
  do_one_initcall
  load_module
  ? unset_module_init_ro_nx
  SyS_init_module
  tracesys
---[ end trace d24a70b0d3ddf733 ]---
EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 
:3f:0e.0
EDAC sbridge: Driver loaded.

What happens is that bus_register needs a statically allocated lock_key
because it is handed in to lockdep. However, struct mem_ctl_info embeds
struct bus_type (the whole struct, not a pointer to it) which gets
dynamically allocated.

Fix this by using a statically allocated struct bus_type for the MC bus.

Cc: Mauro Carvalho Chehab 
Cc: Markus Trippelsdorf 
Signed-off-by: Borislav Petkov 
---
 drivers/edac/edac_mc.c   |  6 ++
 drivers/edac/edac_mc_sysfs.c | 28 +++-
 drivers/edac/i5100_edac.c|  2 +-
 include/linux/edac.h |  7 ++-
 4 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 27e86d938262..429e971e02d7 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -48,6 +48,8 @@ static LIST_HEAD(mc_devices);
  */
 static void const *edac_mc_owner;
 
+static struct bus_type mc_bus[EDAC_MAX_MCS];
+
 unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
 unsigned len)
 {
@@ -762,6 +764,10 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 	/* set load time so that error rate can be tracked */
 	mci->start_time = jiffies;
 
+	BUG_ON(mci->mc_idx >= EDAC_MAX_MCS);
+
+	mci->bus = &mc_bus[mci->mc_idx];
+
 	if (edac_create_sysfs_mci_device(mci)) {
 		edac_mc_printk(mci, KERN_WARNING,
 			"failed to create sysfs device\n");
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index ef15a7e613bc..e7c32c4f7837 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -370,7 +370,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 		return -ENODEV;
 
 	csrow->dev.type = &csrow_attr_type;
-	csrow->dev.bus = &mci->bus;
+	csrow->dev.bus = mci->bus;
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
@@ -605,7 +605,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 	dimm->mci = mci;
 
 	dimm->dev.type = &dimm_attr_type;
-	dimm->dev.bus = &mci->bus;
+	dimm->dev.bus = mci->bus;
 	device_initialize(&dimm->dev);
 
 	dimm->dev.parent = &mci->dev;
@@ -975,11 +975,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	 * The memory controller needs its own bus, in order to avoid
 	 * namespace conflicts at /sys/bus/edac.
 	 */
-	mci->bus.name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-	if (!mci->bus.name)
+	mci->bus->name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
+	if (!mci->bus->name)
 		return -ENOMEM;
-	edac_dbg(0, "creating bus %s\n", mci->bus.name);
-   err = 

RE: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Luck, Tony
> What would be a reasonable maximum limit for the number of memory
> controllers, on a -EX machine?

Westmere-EX has one memory controller per socket ... and there are glueless 
systems up to 8 sockets.  So 8 there. Not sure if any OEM is building larger 
machines with a node controller (SGI? Not sure if they build their behemoths 
from -EP or -EX parts).

Ivy Bridge ups the ante with two memory controllers on a socket. So plan on 
doubling soon.
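
The sizing arithmetic above can be written down as a tiny sketch. The helper
name sketch_max_controllers is hypothetical, and applying the 8-socket
glueless limit to Ivy Bridge as well is an assumption taken from "plan on
doubling", not a confirmed platform figure.

```c
/*
 * Hypothetical helper: upper bound on the number of memory controllers,
 * given the socket count and the controllers per socket.
 */
unsigned int sketch_max_controllers(unsigned int sockets,
				    unsigned int controllers_per_socket)
{
	return sockets * controllers_per_socket;
}
```

With the figures from this message, 8 sockets at one controller each gives 8,
and the same 8 sockets at two controllers each gives 16.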

-Tony


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Mauro Carvalho Chehab
On Fri, 12 Jul 2013 16:21:06 +0200
Borislav Petkov wrote:

> On Fri, Jul 12, 2013 at 10:57:41AM -0300, Mauro Carvalho Chehab wrote:
> > This will overwrite the contents of the static variable mc_bus for
> > every new memory controller. Are you sure that bus.name is only used
> > at registration, or are its contents stored somewhere?
> 
> bus_register does kobject_set_name which copies bus->name, for example,

Ok, so, it could be safe.

> but I didn't look exhaustively.

Did you try removing and reinserting the edac driver a few times on a
multi-memory-controller machine? Were the bus nodes created properly?
> 
> Just to be on the safe side, I should probably do a
> 
> static const char **bus_names = { "mc0", "mc1", ..., "mc7" };

You would likely need to use an array for the bus_type too, if reusing
the static one is an issue.

> and use it. Are 8 enough for your edac drivers too?

With edac_ghes, I suspect that the worst case, on the Intel side, is the
Nehalem/Sandy Bridge/Ivy Bridge EX machines.

Tony,

What would be a reasonable maximum limit for the number of memory
controllers, on a -EX machine?

Cheers,
Mauro


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 04:28:44PM +0200, Markus Trippelsdorf wrote:
> Yes, it's working fine here, too. Thanks Boris.

Thanks Markus!

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Markus Trippelsdorf
On 2013.07.12 at 15:41 +0200, Borislav Petkov wrote:
> On Fri, Jul 12, 2013 at 10:04:28AM +0200, Markus Trippelsdorf wrote:
> > Mauro said he will fix this in the coming weeks:
> > 
> > http://article.gmane.org/gmane.linux.kernel/1522719
> 
> Here's a possible fix which works fine here. Markus, if you could verify
> please...

Yes, it's working fine here, too. Thanks Boris.

-- 
Markus


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 10:57:41AM -0300, Mauro Carvalho Chehab wrote:
> This will overwrite the contents of the static variable mc_bus for
> every new memory controller. Are you sure that bus.name is only used
> at registration, or are its contents stored somewhere?

bus_register does kobject_set_name which copies bus->name, for example,
but I didn't look exhaustively.

Just to be on the safe side, I should probably do a

static const char **bus_names = { "mc0", "mc1", ..., "mc7" };

and use it. Are 8 enough for your edac drivers too?
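
As a rough sketch of what such a table could look like in compilable C: a
brace-enclosed list of string literals needs an array type rather than a
plain pointer-to-pointer, so an array of const pointers is assumed here. The
names sketch_bus_names and sketch_bus_name are hypothetical, and the eight
entries only mirror the count asked about above.

```c
#include <stddef.h>

/* Hypothetical fixed table of bus names, one per supported controller. */
static const char * const sketch_bus_names[] = {
	"mc0", "mc1", "mc2", "mc3", "mc4", "mc5", "mc6", "mc7",
};

#define SKETCH_NUM_BUS_NAMES \
	(sizeof(sketch_bus_names) / sizeof(sketch_bus_names[0]))

/* Return the static name for controller idx, or NULL if out of range. */
const char *sketch_bus_name(unsigned int idx)
{
	return idx < SKETCH_NUM_BUS_NAMES ? sketch_bus_names[idx] : NULL;
}
```

Because the strings live in static storage, nothing has to be allocated or
freed per controller, at the cost of a hard upper limit on the index.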

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Mauro Carvalho Chehab

On 12-07-2013 10:41, Borislav Petkov wrote:

On Fri, Jul 12, 2013 at 10:04:28AM +0200, Markus Trippelsdorf wrote:

Mauro said he will fix this in the coming weeks:

http://article.gmane.org/gmane.linux.kernel/1522719

Here's a possible fix which works fine here. Markus, if you could verify
please...

I probably should also tag it for stable since the issue is in 3.10.
I'll leave it in -next a bit though, to have some coverage.

--
From: Borislav Petkov 
Date: Fri, 12 Jul 2013 10:53:38 +0200
Subject: [PATCH] EDAC: Fix lockdep splat

Fix the following:

BUG: key 88043bdd0330 not in .data!
[ cut here ]
WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0()
DEBUG_LOCKS_WARN_ON(1)
Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw 
gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr 
iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor 
microcode
CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1
Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
  0009 880439a1d920 8160a9a9 880439a1d958
  8103d9e0 88043af4a510 81a16e11 
  88043bdd0330  880439a1d9b8 8103dacc
Call Trace:
   dump_stack
   warn_slowpath_common
   warn_slowpath_fmt
   lockdep_init_map
   ? trace_hardirqs_on_caller
   ? trace_hardirqs_on
   debug_mutex_init
   __mutex_init
   bus_register
   edac_create_sysfs_mci_device
   edac_mc_add_mc
   sbridge_probe
   pci_device_probe
   driver_probe_device
   __driver_attach
   ? driver_probe_device
   bus_for_each_dev
   driver_attach
   bus_add_driver
   driver_register
   __pci_register_driver
   ? 0xa0010fff
   sbridge_init
   ? 0xa0010fff
   do_one_initcall
   load_module
   ? unset_module_init_ro_nx
   SyS_init_module
   tracesys
---[ end trace d24a70b0d3ddf733 ]---
EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 
:3f:0e.0
EDAC sbridge: Driver loaded.

What happens is that bus_register needs a statically allocated lock_key
because it is handed in to lockdep. However, struct mem_ctl_info embeds
struct bus_type (the whole struct, not a pointer to it) which gets
dynamically allocated.

Fix this by using a statically allocated struct bus_type for the MC bus.

Cc: Mauro Carvalho Chehab 
Cc: Markus Trippelsdorf 
Signed-off-by: Borislav Petkov 
---
  drivers/edac/edac_mc.c   |  6 ++
  drivers/edac/edac_mc_sysfs.c | 28 +++-
  drivers/edac/i5100_edac.c|  2 +-
  include/linux/edac.h |  2 +-
  4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 27e86d938262..2179f48cfe16 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -48,6 +48,10 @@ static LIST_HEAD(mc_devices);
   */
  static void const *edac_mc_owner;
  
+static struct bus_type mc_bus = {
+	.dev_name = "edac_mc",
+};
+
  unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
 unsigned len)
  {
@@ -762,6 +766,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 	/* set load time so that error rate can be tracked */
 	mci->start_time = jiffies;
 
+	mci->bus = &mc_bus;
+
 	if (edac_create_sysfs_mci_device(mci)) {
 		edac_mc_printk(mci, KERN_WARNING,
 			"failed to create sysfs device\n");
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 67610a6ebf87..c4d700a577d2 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -370,7 +370,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 		return -ENODEV;
 
 	csrow->dev.type = &csrow_attr_type;
-	csrow->dev.bus = &mci->bus;
+	csrow->dev.bus = mci->bus;
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
@@ -605,7 +605,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 	dimm->mci = mci;
 
 	dimm->dev.type = &dimm_attr_type;
-	dimm->dev.bus = &mci->bus;
+	dimm->dev.bus = mci->bus;
 	device_initialize(&dimm->dev);
 
 	dimm->dev.parent = &mci->dev;
@@ -975,11 +975,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	 * The memory controller needs its own bus, in order to avoid
 	 * namespace conflicts at /sys/bus/edac.
 	 */
-	mci->bus.name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-	if (!mci->bus.name)
+	mci->bus->name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
+	if (!mci->bus->name)
 		return -ENOMEM;


This will overwrite the contents of the static variable mc_bus for every
new memory controller. Are you sure that bus.name is only used at registration,
or are its contents stored somewhere?

Otherwise, you may have trouble at module removal and/or in other places.
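
A minimal userspace sketch of the concern, under simplifying assumptions: the
types sketch_bus and sketch_mci are hypothetical, and the name is stored in a
small inline buffer here instead of a kasprintf()'d pointer. With one shared
static object, the second controller overwrites the name the first one sees;
with one static object per index, each controller keeps its own.

```c
#include <errno.h>
#include <stdio.h>

#define SKETCH_MAX_MCS 16	/* hypothetical upper limit */

struct sketch_bus {
	char name[8];
};

struct sketch_mci {
	unsigned int mc_idx;
	struct sketch_bus *bus;
};

static struct sketch_bus shared_bus;			/* one object for all */
static struct sketch_bus per_mc_bus[SKETCH_MAX_MCS];	/* one per controller */

/* Every controller points at the same static object and renames it. */
void sketch_attach_shared(struct sketch_mci *mci)
{
	mci->bus = &shared_bus;
	snprintf(mci->bus->name, sizeof(mci->bus->name), "mc%u", mci->mc_idx);
}

/* Each controller gets its own static slot, selected by index. */
int sketch_attach_per_mc(struct sketch_mci *mci)
{
	if (mci->mc_idx >= SKETCH_MAX_MCS)
		return -EINVAL;

	mci->bus = &per_mc_bus[mci->mc_idx];
	snprintf(mci->bus->name, sizeof(mci->bus->name), "mc%u", mci->mc_idx);
	return 0;
}
```

Whether the shared variant is actually harmful in the kernel depends on what
bus_register() keeps a reference to, which is exactly the open question here.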

Regards,
Mauro

-   edac_dbg(0, 

Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 10:04:28AM +0200, Markus Trippelsdorf wrote:
> Mauro said he will fix this in the coming weeks:
> 
> http://article.gmane.org/gmane.linux.kernel/1522719

Here's a possible fix which works fine here. Markus, if you could verify
please...

I probably should also tag it for stable since the issue is in 3.10.
I'll leave it in -next a bit though, to have some coverage.

--
From: Borislav Petkov 
Date: Fri, 12 Jul 2013 10:53:38 +0200
Subject: [PATCH] EDAC: Fix lockdep splat

Fix the following:

BUG: key 88043bdd0330 not in .data!
[ cut here ]
WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0()
DEBUG_LOCKS_WARN_ON(1)
Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw 
gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr 
iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor 
microcode
CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1
Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
 0009 880439a1d920 8160a9a9 880439a1d958
 8103d9e0 88043af4a510 81a16e11 
 88043bdd0330  880439a1d9b8 8103dacc
Call Trace:
  dump_stack
  warn_slowpath_common
  warn_slowpath_fmt
  lockdep_init_map
  ? trace_hardirqs_on_caller
  ? trace_hardirqs_on
  debug_mutex_init
  __mutex_init
  bus_register
  edac_create_sysfs_mci_device
  edac_mc_add_mc
  sbridge_probe
  pci_device_probe
  driver_probe_device
  __driver_attach
  ? driver_probe_device
  bus_for_each_dev
  driver_attach
  bus_add_driver
  driver_register
  __pci_register_driver
  ? 0xa0010fff
  sbridge_init
  ? 0xa0010fff
  do_one_initcall
  load_module
  ? unset_module_init_ro_nx
  SyS_init_module
  tracesys
---[ end trace d24a70b0d3ddf733 ]---
EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 
:3f:0e.0
EDAC sbridge: Driver loaded.

What happens is that bus_register needs a statically allocated lock_key
because it is handed in to lockdep. However, struct mem_ctl_info embeds
struct bus_type (the whole struct, not a pointer to it) which gets
dynamically allocated.

Fix this by using a statically allocated struct bus_type for the MC bus.

Cc: Mauro Carvalho Chehab 
Cc: Markus Trippelsdorf 
Signed-off-by: Borislav Petkov 
---
 drivers/edac/edac_mc.c   |  6 ++
 drivers/edac/edac_mc_sysfs.c | 28 +++-
 drivers/edac/i5100_edac.c|  2 +-
 include/linux/edac.h |  2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 27e86d938262..2179f48cfe16 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -48,6 +48,10 @@ static LIST_HEAD(mc_devices);
  */
 static void const *edac_mc_owner;
 
+static struct bus_type mc_bus = {
+   .dev_name = "edac_mc",
+};
+
 unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
 unsigned len)
 {
@@ -762,6 +766,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 	/* set load time so that error rate can be tracked */
 	mci->start_time = jiffies;
 
+	mci->bus = &mc_bus;
+
 	if (edac_create_sysfs_mci_device(mci)) {
 		edac_mc_printk(mci, KERN_WARNING,
 			"failed to create sysfs device\n");
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 67610a6ebf87..c4d700a577d2 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -370,7 +370,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 		return -ENODEV;
 
 	csrow->dev.type = &csrow_attr_type;
-	csrow->dev.bus = &mci->bus;
+	csrow->dev.bus = mci->bus;
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
@@ -605,7 +605,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 	dimm->mci = mci;
 
 	dimm->dev.type = &dimm_attr_type;
-	dimm->dev.bus = &mci->bus;
+	dimm->dev.bus = mci->bus;
 	device_initialize(&dimm->dev);
 
 	dimm->dev.parent = &mci->dev;
@@ -975,11 +975,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	 * The memory controller needs its own bus, in order to avoid
 	 * namespace conflicts at /sys/bus/edac.
 	 */
-	mci->bus.name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-	if (!mci->bus.name)
+	mci->bus->name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
+	if (!mci->bus->name)
 		return -ENOMEM;
-	edac_dbg(0, "creating bus %s\n", mci->bus.name);
-	err = bus_register(&mci->bus);
+
+	edac_dbg(0, "creating bus %s\n", mci->bus->name);
+
+	err = bus_register(mci->bus);
 	if (err < 0)
 		return err;
 
@@ -988,7 +990,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	device_initialize(&mci->dev);
 
mci->dev.parent = 

Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Markus Trippelsdorf
On 2013.07.12 at 10:19 +0800, Ming Lei wrote:
> On Mon, Jul 8, 2013 at 6:25 AM, Linda Walsh  wrote:
> > Also am seeing this for the first time:
> >
> > (don't know, but seems unlikely to be related to
> > https://patchwork.kernel.org/patch/87359/
> > Yet it is the only hit I found for the same message.
> >
> >
> > Looks like it's back to a more stable 3.9.8...
> > (*sigh*)
> >
> >
> > BUG: key 880c1148c478 not in .data!
> > [4.429474] [ cut here ]
> > [4.434236] WARNING: at kernel/lockdep.c:2987
> > lockdep_init_map+0x45e/0x490()
> > [4.441414] DEBUG_LOCKS_WARN_ON(1)
> > [4.444684] Modules linked in:
> > [4.448168] usb 1-3.2: new low-speed USB device number 3 using ehci-pci
> > [4.454975] CPU: 10 PID: 1 Comm: swapper/0 Tainted: G  I
> > 3.10.0-Isht-Van #1
> > [4.462862] Hardware name: Dell Inc. PowerEdge T610/0CX0R0, BIOS 6.3.0
> > 07/24/2012
> > [4.470475]  0009 880c13175a70 815bb279
> > 880c13175aa8
> > [4.478221]  8104641c 880c11c12130 880c1148c478
> > 
> > [4.485988]  880c11c12058 880c12386180 880c13175b08
> > 81046487
> > [4.493800] Call Trace:
> > [4.496472]  [] dump_stack+0x19/0x1b
> > [4.501776]  [] warn_slowpath_common+0x5c/0x80
> > [4.507917]  [] warn_slowpath_fmt+0x47/0x50
> > [4.513790]  [] lockdep_init_map+0x45e/0x490
> > [4.519775]  [] debug_mutex_init+0x2d/0x40
> > [4.525567]  [] __mutex_init+0x51/0x60
> > [4.531017]  [] bus_register+0x158/0x2c0
> > [4.536646]  [] edac_create_sysfs_mci_device+0x53/0x540
> 
> Looks like this is because the bus_type of 'struct mem_ctl_info' is allocated
> dynamically instead of being kept statically in .data.

Mauro said he will fix this in the coming weeks:

http://article.gmane.org/gmane.linux.kernel/1522719

-- 
Markus


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 10:04:28AM +0200, Markus Trippelsdorf wrote:
> Mauro said he will fix this in the coming weeks:
>
> http://article.gmane.org/gmane.linux.kernel/1522719

Here's a possible fix which works fine here. Markus, if you could verify
please...

I probably should also tag it for stable since the issue is in 3.10.
I'll leave it in -next a bit though, to have some coverage.

--
From: Borislav Petkov <b...@suse.de>
Date: Fri, 12 Jul 2013 10:53:38 +0200
Subject: [PATCH] EDAC: Fix lockdep splat

Fix the following:

BUG: key ffff88043bdd0330 not in .data!
------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0()
DEBUG_LOCKS_WARN_ON(1)
Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw
gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr
iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor
microcode
CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1
Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
 0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958
 ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000
 ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc
Call Trace:
  dump_stack
  warn_slowpath_common
  warn_slowpath_fmt
  lockdep_init_map
  ? trace_hardirqs_on_caller
  ? trace_hardirqs_on
  debug_mutex_init
  __mutex_init
  bus_register
  edac_create_sysfs_mci_device
  edac_mc_add_mc
  sbridge_probe
  pci_device_probe
  driver_probe_device
  __driver_attach
  ? driver_probe_device
  bus_for_each_dev
  driver_attach
  bus_add_driver
  driver_register
  __pci_register_driver
  ? 0xffffffffa0010fff
  sbridge_init
  ? 0xffffffffa0010fff
  do_one_initcall
  load_module
  ? unset_module_init_ro_nx
  SyS_init_module
  tracesys
---[ end trace d24a70b0d3ddf733 ]---
EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0
EDAC sbridge: Driver loaded.

What happens is that bus_register needs a statically allocated lock_key
because it is handed in to lockdep. However, struct mem_ctl_info embeds
struct bus_type (the whole struct, not a pointer to it) which gets
dynamically allocated.

Fix this by using a statically allocated struct bus_type for the MC bus.

Cc: Mauro Carvalho Chehab <mche...@infradead.org>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Signed-off-by: Borislav Petkov <b...@suse.de>
---
 drivers/edac/edac_mc.c       |  6 ++++++
 drivers/edac/edac_mc_sysfs.c | 28 +++++++++++++++-------------
 drivers/edac/i5100_edac.c    |  2 +-
 include/linux/edac.h         |  2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 27e86d938262..2179f48cfe16 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -48,6 +48,10 @@ static LIST_HEAD(mc_devices);
  */
 static void const *edac_mc_owner;
 
+static struct bus_type mc_bus = {
+	.dev_name = "edac_mc",
+};
+
 unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
 				 unsigned len)
 {
@@ -762,6 +766,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 	/* set load time so that error rate can be tracked */
 	mci->start_time = jiffies;
 
+	mci->bus = &mc_bus;
+
 	if (edac_create_sysfs_mci_device(mci)) {
 		edac_mc_printk(mci, KERN_WARNING,
 			"failed to create sysfs device\n");
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 67610a6ebf87..c4d700a577d2 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -370,7 +370,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 		return -ENODEV;
 
 	csrow->dev.type = &csrow_attr_type;
-	csrow->dev.bus = &mci->bus;
+	csrow->dev.bus = mci->bus;
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
@@ -605,7 +605,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 	dimm->mci = mci;
 
 	dimm->dev.type = &dimm_attr_type;
-	dimm->dev.bus = &mci->bus;
+	dimm->dev.bus = mci->bus;
 	device_initialize(&dimm->dev);
 
 	dimm->dev.parent = &mci->dev;
@@ -975,11 +975,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	 * The memory controller needs its own bus, in order to avoid
 	 * namespace conflicts at /sys/bus/edac.
 	 */
-	mci->bus.name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-	if (!mci->bus.name)
+	mci->bus->name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
+	if (!mci->bus->name)
 		return -ENOMEM;
-	edac_dbg(0, "creating bus %s\n", mci->bus.name);
-	err = bus_register(&mci->bus);
+
+	edac_dbg(0, "creating bus %s\n", mci->bus->name);
+
+	err = bus_register(mci->bus);
 	if (err < 0)
 		return err;
 
@@ -988,7 +990,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
  

Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Mauro Carvalho Chehab

On 12-07-2013 10:41, Borislav Petkov wrote:

> @@ -975,11 +975,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
>  	 * The memory controller needs its own bus, in order to avoid
>  	 * namespace conflicts at /sys/bus/edac.
>  	 */
> -	mci->bus.name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> -	if (!mci->bus.name)
> +	mci->bus->name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> +	if (!mci->bus->name)
>  		return -ENOMEM;


This will override the contents of the static variable mc_bus for every
new memory controller. Are you sure that bus.name is only used at
registration, or are its contents stored somewhere?

Otherwise, you may have troubles at module 

Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 10:57:41AM -0300, Mauro Carvalho Chehab wrote:
> This will override the contents of the static variable mc_bus for
> every new memory controller. Are you sure that bus.name is only used
> at registration, or are its contents stored somewhere?

bus_register does kobject_set_name which copies bus->name, for example,
but I didn't look exhaustively.

Just to be on the safe side, I should probably do a

static const char **bus_names = { "mc0", "mc1", ..., "mc7" };

and use it. Are 8 enough for your edac drivers too?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
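
The static name table floated above could look roughly like the following
userspace C sketch. It is illustrative only: MAX_MCS_SKETCH,
bus_names_sketch and mc_bus_name() are hypothetical names that do not appear
in the patch, the limit of 8 is taken from the question in the message rather
than from any driver, and the table is declared as an array of pointers so
that the brace initializer is valid C.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 8 controllers, the number asked about in the message. */
#define MAX_MCS_SKETCH 8

/* String literals have static storage, so nothing here is heap-allocated. */
static const char * const bus_names_sketch[MAX_MCS_SKETCH] = {
	"mc0", "mc1", "mc2", "mc3", "mc4", "mc5", "mc6", "mc7",
};

/* Return the static name for controller idx, or NULL if idx is out of range. */
const char *mc_bus_name(int idx)
{
	if (idx < 0 || idx >= MAX_MCS_SKETCH)
		return NULL;

	/* Every slot of the table is initialized above. */
	assert(bus_names_sketch[idx] != NULL);
	return bus_names_sketch[idx];
}
```

A caller would use mc_bus_name(mci_index) in place of a kasprintf() result and
would have to treat a NULL return as "too many controllers".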


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Markus Trippelsdorf
On 2013.07.12 at 15:41 +0200, Borislav Petkov wrote:
> On Fri, Jul 12, 2013 at 10:04:28AM +0200, Markus Trippelsdorf wrote:
> > Mauro said he will fix this in the coming weeks:
> >
> > http://article.gmane.org/gmane.linux.kernel/1522719
>
> Here's a possible fix which works fine here. Markus, if you could verify
> please...

Yes, it's working fine here, too. Thanks Boris.

-- 
Markus


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Borislav Petkov
On Fri, Jul 12, 2013 at 04:28:44PM +0200, Markus Trippelsdorf wrote:
> Yes, it's working fine here, too. Thanks Boris.

Thanks Markus!

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Mauro Carvalho Chehab
On Fri, 12 Jul 2013 16:21:06 +0200
Borislav Petkov <b...@alien8.de> wrote:

> On Fri, Jul 12, 2013 at 10:57:41AM -0300, Mauro Carvalho Chehab wrote:
> > This will override the contents of the static variable mc_bus for
> > every new memory controller. Are you sure that bus.name is only used
> > at registration, or are its contents stored somewhere?
>
> bus_register does kobject_set_name which copies bus->name, for example,

OK, so it could be safe.

> but I didn't look exhaustively.

Did you try to remove and reinsert the edac driver a few times on a
machine with multiple memory controllers? Did the bus nodes get created
properly?

> Just to be on the safe side, I should probably do a
>
> static const char **bus_names = { "mc0", "mc1", ..., "mc7" };

You would likely need to use an array for the bus_type too, if reusing
the static one is an issue.

> and use it. Are 8 enough for your edac drivers too?

With edac_ghes, I suspect that the worst case, on the Intel side, is the
Nehalem/Sandy Bridge/Ivy Bridge EX machines.

Tony,

What would be a reasonable maximum limit for the number of memory
controllers on an -EX machine?

Cheers,
Mauro
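
The name question discussed above can be pictured with a small userspace C
sketch. It assumes, as stated in the thread, that registration copies the
name; bus_register_sketch(), registered_name(), register_two_controllers()
and the fixed-size registry are hypothetical stand-ins and not kernel APIs.
Only the name is modelled, so the sketch says nothing about any other
per-bus state that a shared static object might carry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN_SKETCH 16
#define MAX_BUSES_SKETCH 4

struct bus_sketch {
	const char *name;
};

/* The registry keeps its own copy of every name it is handed. */
static char registered_names[MAX_BUSES_SKETCH][NAME_LEN_SKETCH];
static int nr_registered;

/* Copy bus->name into the registry; return the slot used, or -1 on error. */
int bus_register_sketch(const struct bus_sketch *bus)
{
	if (nr_registered >= MAX_BUSES_SKETCH || !bus->name)
		return -1;
	snprintf(registered_names[nr_registered], NAME_LEN_SKETCH, "%s",
		 bus->name);
	return nr_registered++;
}

/* Return the registry's private copy of the name stored in slot. */
const char *registered_name(int slot)
{
	assert(slot >= 0 && slot < nr_registered);
	return registered_names[slot];
}

/* One shared static bus object, reused for every controller. */
static struct bus_sketch shared_bus;

/* Register controllers 0 and 1 through the same object and name buffer. */
int register_two_controllers(int *slot0, int *slot1)
{
	char name[NAME_LEN_SKETCH];

	snprintf(name, sizeof(name), "mc%d", 0);
	shared_bus.name = name;
	*slot0 = bus_register_sketch(&shared_bus);

	/* Overwrites the buffer that the first registration was given. */
	snprintf(name, sizeof(name), "mc%d", 1);
	shared_bus.name = name;
	*slot1 = bus_register_sketch(&shared_bus);

	shared_bus.name = NULL;	/* the local buffer is about to go away */
	return (*slot0 < 0 || *slot1 < 0) ? -1 : 0;
}
```

Because the copy is taken at registration time, the first slot still holds
its own name after the shared buffer has been overwritten. If the registrar
kept the pointer instead of copying, both slots would end up referring to
the same, later-overwritten storage.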


RE: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Luck, Tony
> What would be a reasonable maximum limit for the number of memory
> controllers, on a -EX machine?

Westmere-EX has one memory controller per socket ... and there are glueless 
systems up to 8 sockets.  So 8 there. Not sure if any OEM is building larger 
machines with a node controller (SGI? Not sure if they build their behemoths 
from -EP or -EX parts).

Ivy Bridge ups the ante with two memory controllers on a socket. So plan on 
doubling soon.

-Tony
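
Taking the numbers above at face value (8 sockets, two controllers per
socket), a statically sized table of bus objects with a range check might be
sketched as below. This is a userspace illustration under that assumption:
EDAC_MAX_MCS_SKETCH, struct bus_sketch, struct mci_sketch and
mci_attach_bus() are hypothetical names, and the value 16 is only the
product of the two figures quoted in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 8 sockets x 2 memory controllers per socket. */
#define EDAC_MAX_MCS_SKETCH 16

struct bus_sketch {
	const char *name;
	int in_use;
};

struct mci_sketch {
	int mc_idx;
	struct bus_sketch *bus;	/* points into the static table below */
};

/* One statically allocated bus object per possible controller. */
static struct bus_sketch mc_bus_sketch[EDAC_MAX_MCS_SKETCH];

/* Give mci its own static bus slot; return 0 on success, -1 if out of range. */
int mci_attach_bus(struct mci_sketch *mci)
{
	assert(mci != NULL);

	if (mci->mc_idx < 0 || mci->mc_idx >= EDAC_MAX_MCS_SKETCH)
		return -1;

	mci->bus = &mc_bus_sketch[mci->mc_idx];
	mci->bus->in_use = 1;
	return 0;
}
```

Each controller gets a distinct object that still has static storage, and an
index beyond the table is rejected instead of writing past the array.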


Re: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-11 Thread Ming Lei
On Mon, Jul 8, 2013 at 6:25 AM, Linda Walsh  wrote:
> Also am seeing this for the first time:
>
> (don't know, but seems unlikely to be related to
> https://patchwork.kernel.org/patch/87359/
> Yet it is the only hit I found for the same message.
>
>
> Looks like it's back to a more stable 3.9.8...
> (*sigh*)
>
>
> BUG: key ffff880c1148c478 not in .data!
> [    4.429474] ------------[ cut here ]------------
> [    4.434236] WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x45e/0x490()
> [    4.441414] DEBUG_LOCKS_WARN_ON(1)
> [    4.444684] Modules linked in:
> [    4.448168] usb 1-3.2: new low-speed USB device number 3 using ehci-pci
> [    4.454975] CPU: 10 PID: 1 Comm: swapper/0 Tainted: G  I 3.10.0-Isht-Van #1
> [    4.462862] Hardware name: Dell Inc. PowerEdge T610/0CX0R0, BIOS 6.3.0 07/24/2012
> [    4.470475]  0000000000000009 ffff880c13175a70 ffffffff815bb279 ffff880c13175aa8
> [    4.478221]  ffffffff8104641c ffff880c11c12130 ffff880c1148c478 0000000000000000
> [    4.485988]  ffff880c11c12058 ffff880c12386180 ffff880c13175b08 ffffffff81046487
> [    4.493800] Call Trace:
> [    4.496472]  [<ffffffff815bb279>] dump_stack+0x19/0x1b
> [    4.501776]  [<ffffffff8104641c>] warn_slowpath_common+0x5c/0x80
> [    4.507917]  [<ffffffff81046487>] warn_slowpath_fmt+0x47/0x50
> [    4.513790]  [<ffffffff8109c1fe>] lockdep_init_map+0x45e/0x490
> [    4.519775]  [<ffffffff8109b12d>] debug_mutex_init+0x2d/0x40
> [    4.525567]  [<ffffffff8106ef61>] __mutex_init+0x51/0x60
> [    4.531017]  [<ffffffff813a1618>] bus_register+0x158/0x2c0
> [    4.536646]  [<ffffffff814c6dc3>] edac_create_sysfs_mci_device+0x53/0x540

It looks like the cause is that the bus_type in 'struct mem_ctl_info' is
allocated dynamically instead of being kept statically in .data.

Thanks,
-- 
Ming Lei
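
The diagnosis above can be pictured with a userspace C sketch that contrasts
an embedded bus object with a pointer to a static one. All names here
(struct bus_sketch, struct mci_embedded, struct mci_pointer, mc_bus_sketch,
addr_inside() and the two helper functions) are hypothetical stand-ins, not
the kernel's definitions, and the address comparison is only a rough
analogue of the check that produces the "not in .data" message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct lock_key_sketch {
	char dummy;
};

struct bus_sketch {
	const char *name;
	struct lock_key_sketch key;	/* must live in static storage */
};

/* Embedded: the bus (and its key) lives wherever the mci was allocated. */
struct mci_embedded {
	int mc_idx;
	struct bus_sketch bus;
};

/* Pointer: the mci only refers to a bus object with static storage. */
struct mci_pointer {
	int mc_idx;
	struct bus_sketch *bus;
};

static struct bus_sketch mc_bus_sketch = { .name = "edac_mc" };

/* Return 1 if p lies inside the len-byte object that starts at base. */
int addr_inside(const void *p, const void *base, size_t len)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t b = (uintptr_t)base;

	assert(base != NULL);
	return a >= b && a - b < len;
}

/* With an embedded bus, the key sits inside the heap allocation. */
int embedded_key_is_in_allocation(void)
{
	struct mci_embedded *mci = calloc(1, sizeof(*mci));
	int inside;

	if (!mci)
		return -1;
	inside = addr_inside(&mci->bus.key, mci, sizeof(*mci));
	free(mci);
	return inside;
}

/* With a pointer, the key sits inside the static object instead. */
int pointer_key_is_static(void)
{
	struct mci_pointer *mci = calloc(1, sizeof(*mci));
	int is_static;

	if (!mci)
		return -1;
	mci->bus = &mc_bus_sketch;
	is_static = addr_inside(&mci->bus->key, &mc_bus_sketch,
				sizeof(mc_bus_sketch)) &&
		    !addr_inside(&mci->bus->key, mci, sizeof(*mci));
	free(mci);
	return is_static;
}
```

Both helpers return 1 when the allocation succeeds: the embedded key is part
of the dynamically allocated structure, while the pointer variant leaves the
key in the statically allocated object.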

