Hello
This is a widespread problem on AMD machines: the platform is buggy and
reports invalid L3 cache information in this case. Upgrading the BIOS may help.
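
To see why the two cpusets in the report below cannot both be right, here is a
rough Python sketch of the kind of consistency check involved. It is not
hwloc's actual code; the helper names cpus() and relation() are made up for
illustration. Only the two masks come from the error message.

```python
def cpus(mask):
    """Return the sorted list of CPU indexes whose bit is set in mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def relation(a, b):
    """Classify how two cpuset bitmasks relate to each other."""
    common = a & b
    if common == 0:
        return "disjoint"
    if common == a or common == b:
        # One set is entirely contained in the other (or they are equal).
        return "included"
    return "intersects without inclusion"


l3 = 0x000003F0    # L3 P#6 cpuset from the error message
numa = 0x0000003F  # NUMANode P#0 cpuset from the error message

print(cpus(l3))            # [4, 5, 6, 7, 8, 9]
print(cpus(numa))          # [0, 1, 2, 3, 4, 5]
print(relation(l3, numa))  # intersects without inclusion
```

The L3 claims CPUs 4-9 while the NUMA node holds CPUs 0-5, so they share CPUs
4 and 5 but neither contains the other. That cannot be placed in a tree, which
is why hwloc complains.
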
Anyway, I guess Slurm doesn't care much about L3 cache affinity, so you
can ignore the error by setting HWLOC_HIDE_ERRORS=1 in the environment.
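
As a minimal sketch of what "in the environment" means here, the Python
snippet below launches a command with HWLOC_HIDE_ERRORS=1 added to a copy of
the current environment. The function names are hypothetical; only the
variable name and the hwloc-info command come from this thread.

```python
import os
import subprocess


def env_hiding_hwloc_errors(base=None):
    """Return a copy of the environment with HWLOC_HIDE_ERRORS=1 set."""
    env = dict(os.environ if base is None else base)
    env["HWLOC_HIDE_ERRORS"] = "1"
    return env


def run_with_hidden_errors(cmd):
    """Run cmd (a list of arguments) with the hwloc error banner hidden."""
    return subprocess.run(cmd, env=env_hiding_hwloc_errors(), check=False)


# Example (assumes hwloc-info is installed and on PATH):
# run_with_hidden_errors(["hwloc-info"])
```

For slurmd the variable has to be set in the environment of the daemon itself,
for instance in whatever script starts it. How that is done depends on your
installation.
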
More details are also available at
http://www.open-mpi.org/projects/hwloc/doc/v1.10.0/a00028.php#faq_os_error
Brice


On 17/01/2015 20:41, Joseph Mingrone wrote:
> Hello,
>
> Here is the error message we see when starting slurmd or running
> hwloc-info.
>
> ****************************************************************************
> * hwloc has encountered what looks like an error from the operating system.
> *
> * L3 (P#6 cpuset 0x000003f0) intersects with NUMANode (P#0 cpuset 0x0000003f) without inclusion!
> * Error occurred in topology.c line 940
> *
> * Please report this error message to the hwloc user's mailing list,
> * along with any relevant topology information from your platform.
> ****************************************************************************
> depth 0:        1 Machine (type #1)
>  depth 1:       4 Socket (type #3)
>   depth 2:      8 NUMANode (type #2)
>    depth 3:     8 L3Cache (type #4)
>     depth 4:    24 L2Cache (type #4)
>      depth 5:   24 L1iCache (type #4)
>       depth 6:  48 L1dCache (type #4)
>        depth 7: 48 Core (type #5)
>         depth 8: 48 PU (type #6)
>
> This is a system with four 12-core AMD Opteron 6348 CPUs.
>
> Other nodes with older AMD CPUs also running FreeBSD 10.1 don't report
> the error.
>
> If there is any other information I can provide, please let me know.
>
> Joseph
> _______________________________________________
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post: 
> http://www.open-mpi.org/community/lists/hwloc-users/2015/01/1150.php
