Opteron 6100-series "Magny-Cours" [Was: [HEADSUP] new x86 smp topology detection code]

2016-04-07 Thread Andriy Gapon

If anyone uses FreeBSD head on a system with "Magny-Cours" CPU(s), could you
please test the following patch?
http://dpaste.com/0XRSXZB.txt
I am interested in the output of the kern.sched.topology_spec sysctl before and
after the patch, and in the lines containing "ID shift" from the dmesg of a
verbose boot (also before and after).

Thank you!

On 04/04/2016 19:31, Andriy Gapon wrote:
> 
> 
> I've just committed new code for detecting SMP (processor and cache) topology on
> x86 systems.  Please be aware.
> 
> If you get any panics or crashes that look like they might be caused by this
> change, please send me a copy of the report.
> 
> Another thing to watch is kern.sched.topology_spec.
> Please check whether the reported topology reasonably matches what you expect on
> your system.
> You can install the hwloc package (devel/hwloc) and then run lstopo -p --no-io to
> double-check the topology (--output-format ascii would produce a nice ASCII-art
> diagram).
> 
> I hope that you see only improvements :-)
> 
> -------- Forwarded Message --------
> Subject: svn commit: r297558 - in head/sys: kern sys x86/x86
> Date: Mon, 4 Apr 2016 16:09:29 +0000 (UTC)
> From: Andriy Gapon 
> To: src-committ...@freebsd.org, svn-src-...@freebsd.org, 
> svn-src-h...@freebsd.org
> 
> Author: avg
> Date: Mon Apr  4 16:09:29 2016
> New Revision: 297558
> URL: https://svnweb.freebsd.org/changeset/base/297558
> 
> Log:
>   new x86 smp topology detection code
> 
>   Previously, the code determined a topology of processing units
>   (hardware threads, cores, packages) and then deduced a cache topology
>   using certain assumptions.  The new code builds a topology that
>   includes both processing units and caches using the information
>   provided by the hardware.
> 
>   At the moment, the discovered full topology is used only to create
>   a scheduling topology for SCHED_ULE.
>   There is no KPI for other kernel uses.
> 
>   Summary:
>   - based on APIC ID derivation rules for Intel and AMD CPUs
>   - can handle non-uniform topologies
>   - requires homogeneous APIC ID assignment (same bit widths for ID
>     components)
>   - topology for dual-node AMD CPUs may not be optimal
>   - topology for latest AMD CPU models may not be optimal as the code is
>     several years old
>   - supports only thread/package/core/cache nodes
> 
>   Todo:
>     - AMD dual-node processors
>     - latest AMD processors
>     - NUMA nodes
>     - checking for homogeneity of the APIC ID assignment across packages
>     - more flexible cache placement within topology
>     - expose topology to userland, e.g., via sysctl nodes
> 
>   Long term todo:
>     - KPI for CPU sharing and affinity with respect to various resources
>       (e.g., two logical processors may share the same FPU, etc)
> 
>   Reviewed by:  mav
>   Tested by:  mav
>   MFC after:  1 month
>   Differential Revision:  https://reviews.freebsd.org/D2728
> 
> Modified:
>   head/sys/kern/subr_smp.c
>   head/sys/sys/smp.h
>   head/sys/x86/x86/mp_x86.c

-- 
Andriy Gapon


[HEADSUP] new x86 smp topology detection code

2016-04-04 Thread Andriy Gapon


I've just committed new code for detecting SMP (processor and cache) topology on
x86 systems.  Please be aware.

If you get any panics or crashes that look like they might be caused by this
change, please send me a copy of the report.

Another thing to watch is kern.sched.topology_spec.
Please check whether the reported topology reasonably matches what you expect on
your system.
You can install the hwloc package (devel/hwloc) and then run lstopo -p --no-io to
double-check the topology (--output-format ascii would produce a nice ASCII-art
diagram).

I hope that you see only improvements :-)

-------- Forwarded Message --------
Subject: svn commit: r297558 - in head/sys: kern sys x86/x86
Date: Mon, 4 Apr 2016 16:09:29 +0000 (UTC)
From: Andriy Gapon 
To: src-committ...@freebsd.org, svn-src-...@freebsd.org, 
svn-src-h...@freebsd.org

Author: avg
Date: Mon Apr  4 16:09:29 2016
New Revision: 297558
URL: https://svnweb.freebsd.org/changeset/base/297558

Log:
  new x86 smp topology detection code

  Previously, the code determined a topology of processing units
  (hardware threads, cores, packages) and then deduced a cache topology
  using certain assumptions.  The new code builds a topology that
  includes both processing units and caches using the information
  provided by the hardware.

  At the moment, the discovered full topology is used only to create
  a scheduling topology for SCHED_ULE.
  There is no KPI for other kernel uses.

  Summary:
  - based on APIC ID derivation rules for Intel and AMD CPUs
  - can handle non-uniform topologies
  - requires homogeneous APIC ID assignment (same bit widths for ID
    components)
  - topology for dual-node AMD CPUs may not be optimal
  - topology for latest AMD CPU models may not be optimal as the code is
    several years old
  - supports only thread/package/core/cache nodes
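
To illustrate what "same bit widths for ID components" means in the summary
above, here is a minimal sketch of splitting an APIC ID into package, core and
thread numbers with fixed shifts.  The struct and function names are
hypothetical and are not taken from the committed code; the sketch only assumes
that the thread field occupies the lowest bits and the core field the bits
above it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bit widths of the low fields of an APIC ID. */
struct apic_id_layout {
	unsigned thread_bits;	/* bits selecting a hardware thread in a core */
	unsigned core_bits;	/* bits selecting a core in a package */
};

struct apic_id_parts {
	uint32_t pkg;
	uint32_t core;
	uint32_t thread;
};

/*
 * Split an APIC ID using one fixed layout.  The same layout is applied to
 * every ID, which is why the field widths must be the same on all packages.
 */
struct apic_id_parts
apic_id_split(uint32_t apic_id, struct apic_id_layout l)
{
	struct apic_id_parts p;
	unsigned core_shift = l.thread_bits;
	unsigned pkg_shift = l.thread_bits + l.core_bits;

	assert(pkg_shift < 32);
	p.thread = apic_id & ((1u << core_shift) - 1);
	p.core = (apic_id >> core_shift) & ((1u << l.core_bits) - 1);
	p.pkg = apic_id >> pkg_shift;
	return (p);
}
```

For example, with 1 thread bit and 3 core bits, APIC ID 0x1b splits into
package 1, core 5, thread 1.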

  Todo:
    - AMD dual-node processors
    - latest AMD processors
    - NUMA nodes
    - checking for homogeneity of the APIC ID assignment across packages
    - more flexible cache placement within topology
    - expose topology to userland, e.g., via sysctl nodes

  Long term todo:
    - KPI for CPU sharing and affinity with respect to various resources
      (e.g., two logical processors may share the same FPU, etc)

  Reviewed by:  mav
  Tested by:  mav
  MFC after:  1 month
  Differential Revision:  https://reviews.freebsd.org/D2728

Modified:
  head/sys/kern/subr_smp.c
  head/sys/sys/smp.h
  head/sys/x86/x86/mp_x86.c
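
The topology tree in this commit is built with a find-or-add step
(topo_add_node_by_hwid() in the diff): look for an existing child with the
same hardware ID and type, and create one only if none is found.  A minimal
userland sketch of that pattern follows.  It is an illustration under
assumptions, not the kernel code: the type and function names are
hypothetical, the subtype field is omitted, and a plain singly linked sibling
list stands in for the kernel's TAILQ macros so that it builds anywhere.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds, loosely modelled on the commit's node types. */
enum node_type { NODE_SYSTEM, NODE_PKG, NODE_CORE, NODE_PU, NODE_CACHE };

struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
	enum node_type type;
	int hwid;
	int nchildren;
};

/*
 * Return the child of `parent` that matches (type, hwid), creating and
 * appending it if it does not exist yet.  Returns NULL if allocation fails.
 */
struct node *
node_get_child(struct node *parent, enum node_type type, int hwid)
{
	struct node *n, **tailp;

	assert(parent != NULL);
	tailp = &parent->first_child;
	for (n = parent->first_child; n != NULL; n = n->next_sibling) {
		if (n->type == type && n->hwid == hwid)
			return (n);
		tailp = &n->next_sibling;	/* remember where the list ends */
	}

	n = calloc(1, sizeof(*n));
	if (n == NULL)
		return (NULL);
	n->parent = parent;
	n->type = type;
	n->hwid = hwid;
	*tailp = n;
	parent->nchildren++;
	return (n);
}
```

Calling node_get_child() once per logical CPU for each level (package, core,
thread) yields a tree in which CPUs that share a hardware ID at some level end
up under the same node, without any separate de-duplication pass.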

Modified: head/sys/kern/subr_smp.c
==============================================================================
--- head/sys/kern/subr_smp.c	Mon Apr  4 15:56:14 2016	(r297557)
+++ head/sys/kern/subr_smp.c	Mon Apr  4 16:09:29 2016	(r297558)
@@ -39,6 +39,7 @@ __FBSDID("$FreeBSD$");
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,6 +52,10 @@ __FBSDID("$FreeBSD$");
 #include "opt_sched.h"

 #ifdef SMP
+MALLOC_DEFINE(M_TOPO, "toponodes", "SMP topology data");
+#endif
+
+#ifdef SMP
 volatile cpuset_t stopped_cpus;
 volatile cpuset_t started_cpus;
 volatile cpuset_t suspended_cpus;
@@ -556,7 +561,7 @@ smp_rendezvous(void (* setup_func)(void
    smp_rendezvous_cpus(all_cpus, setup_func, action_func, teardown_func, arg);
 }

-static struct cpu_group group[MAXCPU];
+static struct cpu_group group[MAXCPU * MAX_CACHE_LEVELS + 1];

 struct cpu_group *
 smp_topo(void)
@@ -616,6 +621,17 @@ smp_topo(void)
 }

 struct cpu_group *
+smp_topo_alloc(u_int count)
+{
+   static u_int index;
+   u_int curr;
+
+   curr = index;
+   index += count;
+   return (&group[curr]);
+}
+
+struct cpu_group *
 smp_topo_none(void)
 {
    struct cpu_group *top;
@@ -861,3 +877,233 @@ sysctl_kern_smp_active(SYSCTL_HANDLER_AR
    return (error);
 }

+
+#ifdef SMP
+void
+topo_init_node(struct topo_node *node)
+{
+
+   bzero(node, sizeof(*node));
+   TAILQ_INIT(&node->children);
+}
+
+void
+topo_init_root(struct topo_node *root)
+{
+
+   topo_init_node(root);
+   root->type = TOPO_TYPE_SYSTEM;
+}
+
+struct topo_node *
+topo_add_node_by_hwid(struct topo_node *parent, int hwid,
+topo_node_type type, uintptr_t subtype)
+{
+   struct topo_node *node;
+
+   TAILQ_FOREACH_REVERSE(node, &parent->children,
+   topo_children, siblings) {
+   if (node->hwid == hwid
+   && node->type == type && node->subtype == subtype) {
+   return (node);
+   }
+   }
+
+   node = malloc(sizeof(*node), M_TOPO, M_WAITOK);
+   topo_init_node(node);
+   node->parent = parent;
+   node->hwid = hwid;
+   node->type = type;
+   node->subtype = subtype;
+   TAILQ_INSERT_TAIL(&parent->children, node, siblings);
+   parent->nchildren++;
+
+   return (node);
+}
+
+struct topo_node *
+topo_find_node_by_hwid(struct topo_node *parent, int hwid,
+topo_node_type type, uintptr_t subtype)
+{
+
+   struct topo_node *node;
+
+