Good start! Looking forward to interactions between hwloc and
MPI_Graph_dist (and to extending these down to device I/O, memories,
and processors on the PCIe bus). Does anyone envision this intersection
within MPI (esp. Open MPI) being carried down to the computational
network graph structure?
It is documented in
http://developer.download.nvidia.com/compute/cuda/4_0/docs/GPUDirect_Technology_Overview.pdf
set CUDA_NIC_INTEROP=1
From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
Behalf Of Sebastian Rinke
Sent: Wednesday, January 18, 2012 8:15 AM
To: Open MPI
Rolf,
I'm still experimenting with cuda-rdma-2 on CUDA 4.1 ...
I'll build up cuda-rdma-3 and see what performance changes result.
Ken Lloyd
On Fri, 2011-12-09 at 11:45 -0800, Rolf vandeVaart wrote:
> WHAT: Add new sm BTL, and supporting mpools, that can also support
> CUDA RDMA.
>
That makes sense to me.
-Original Message-
From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
Behalf Of Nathan T. Hjelm
Sent: Tuesday, November 08, 2011 8:36 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] Remote key sizes
On Tue, 8 Nov 2011 06:36:03 -0800,
According to http://gcc.gnu.org/onlinedocs/cpp/If.html
"The `#if' directive allows you to test the value of an arithmetic
expression, rather than the mere existence of one macro."
Is the objective to test for the existence of the macro, its value, or its
value IFF it exists?
Ken Lloyd
Nadia,
Interesting. I haven't tried pushing this to levels above 8 on a particular
machine. Do you think that cpuset / paffinity / hwloc applies only at the
machine level, beyond which you need to employ a graph with carto?
Regards,
Ken
-Original Message-
From:
ct accompanying man page.
HTH
Ralph
On Oct 12, 2010, at 9:15 AM, Kenneth Lloyd wrote:
> Ralph,
>
> There is really no need to do anything different to accommodate us
> "oddball" cases. Continue to "do what you do".
>
> Ken
>
> -Original Message-
version for
everyone else. I'll see if I can keep a single version, though, assuming the
code doesn't get too convoluted so as to become unmaintainable.
Otherwise, I'll branch it and "freeze" a non-threaded version for the
unusual case.
Thanks!
On Oct 12, 2010, at 8:51 AM, Kenneth Lloyd wrote:
In certain hybrid, heterogeneous HPC configurations, mpirun often cannot or
should not be threaded through the OS under which OpenMPI runs. The primary
OS and MPI can configure management nodes and topologies (even other MPI
layers) that subsequently spawn various OSes and other lightweight
I would support making hwloc a first class element (for what it's worth, and
ompi/hwloc makes sense).
The INRIA paper is interesting and insightful but incomplete. It is, however,
consistent with some of our findings. The NUMA computational fabrics for
various codes / data combinations may be learned. It is available here:
http://hal.archives-ouvertes.fr/inria-00486178/en/
On Sep 22, 2010, at 11:53 AM, Kenneth Lloyd wrote:
> Jeff,
>
> Is that EuroMPI2010 ob1 paper publicly available? I get involved in various
> NUMA partitioning/architecting studies and it seems there is not a lot of
> -Original Message-
> From: devel-boun...@open-mpi.org
> [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres
> Sent: Tuesday, December 15, 2009 6:32 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] carto vs. hwloc
>
> On Dec 15, 2009, at 2:20 PM, Ralph Castain
www.wattsys.com
http://www.linkedin.com/pub/kenneth-lloyd/7/9a/824
http://kenscomplex.blogspot.com/
Luigi,
I tried a configuration on a small test cluster similar to your Approach 1,
with interesting (promising) results. While the topology is deterministic, I
found that actual performance is under-determined in practice, depending on
the symmetry and partitioning of the tasks and the data.
I agree with Terry and Eugene, but now what are we going to do about it?
This is a potentially very powerful feature.
Ken
> -Original Message-
> From: devel-boun...@open-mpi.org
> [mailto:devel-boun...@open-mpi.org] On Behalf Of Terry Dontje
> Sent: Tuesday, October 13, 2009 7:08 AM
>
Ralph, and all,
The Japanese have a term poka-yoke which means "fail-safing". This is an
excellent concept to apply. The term does not mean covering all unintended
consequences of error and omission, though.
If folks are downloading OMPI (or any software) for unauthorized purposes,
that seems
In some of the experiments I've run and studied on exclusive binding to
specific cores, the performance metrics (which have yielded both excellent
gains and phases of reduced performance) have depended upon the nature of
the experiment being run (a task partitioning problem) and how the
Hi all,
I've just recently joined because I, too, am working on a (possible) OpenMPI
component. The basics are: A Topology and Weight Evolving Artificial Neural
Network (TWEANN) that learns to configure point-to-point comm. on ob1
fabrics in various compute clusters given a context of existing