hwloc (Hardware Locality) 2.5.0rc1 is now available for download.

        https://www.open-mpi.org/software/hwloc/v2.5/

v2.5.0 is a fairly big release with 3 major topics:
* NVIDIA NVLink and AMD XGMI bandwidth matrices between GPUs.
  The distances API has been extended to make them convenient
  to consult, but we may add more helpers based on user feedback.
* A new "levelzero" backend for Intel oneAPI L0 devices.
  GPU distances will be added in the future.
* New topology flags to mitigate spurious binding changes
  during hwloc discovery on Windows.

The following is a summary of the changes since the v2.4 series.

Version 2.5.0
-------------
* API
  + Add hwloc/windows.h to query Windows processor groups.
  + Add hwloc_get_obj_with_same_locality() to convert between objects
    with the same locality, for instance NUMA nodes and Packages,
    or OS devices within a PCI device.
  + Add hwloc_distances_transform() to modify distances structures.
    - hwloc-annotate and lstopo have new distances-transform options.
  + hwloc_distances_add() is replaced with _add_create() followed by
    _add_values() and _add_commit(). See hwloc/distances.h for details.
  + Add topology flags to mitigate binding modifications during
    hwloc discovery, especially on Windows:
    - HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING and _MEMBINDING
      restrict discovery to PUs and NUMA nodes inside the binding.
    - HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING prevents hwloc from ever
      changing the binding during discovery.
* Backends
  + Add a levelzero backend for oneAPI L0 devices, exposed as OS devices
    of subtype "LevelZero" with names such as "ze0".
    - Add hwloc/levelzero.h for interoperability between L0 API
      devices and hwloc cpusets or OS devices.
  + Expose NEC Vector Engine cards on Linux as OS devices of subtype
    "VectorEngine" and name "ve0", etc.
    Thanks to Anara Kozhokanova, Tim Cramer and Erich Focht for the help.
  + Add a NVLinkBandwidth distances structure between NVIDIA GPUs
    (and POWER processors or NVSwitches) in the NVML backend,
    and a XGMIBandwidth distances structure between AMD GPUs
    in the RSMI backend.
    - See "Topology Attributes: Distances, Memory Attributes and CPU Kinds"
      in the documentation for details about these new distances.
  + Add support for NUMA node 0 being offline in Linux, thanks to Jirka Hladky.
* Build
  + Add --with-cuda-version=<version> or look at the CUDA_VERSION
    environment variable to find the appropriate CUDA pkg-config files.
    Thanks to Stephen Herbein for the suggestion.
    - Also add --with-cuda=<dir> to specify the CUDA installation path
      manually (and its NVML and OpenCL components).
      Thanks to Andrea Bocci for the suggestion.
    - See "How do I enable CUDA and select which CUDA version to use?"
      in the FAQ for details.
* Tools
  + lstopo now has a --windows-processor-groups option on Windows.
  + hwloc-ps now has a --short-name option to avoid long/truncated
    command paths.
  + hwloc-ps now has a --single-ancestor option to return a single
    (possibly too large) object where a process is bound.
  + hwloc-ps --pid-cmd may now query environment variables,
    including MPI-specific variables to find out process ranks.

--
Brice

_______________________________________________
hwloc-announce mailing list
hwloc-announce@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-announce
