What Brice means is that hwloc reports that your cores each have their own
individual caches -- they are not shared. Have a look at a graphical hwloc
output to see for yourself:
lstopo mymachine.png
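If a picture isn't enough, a small hwloc program can report the same thing. The sketch below (assuming the hwloc 1.x C API, where every cache uses the HWLOC_OBJ_CACHE type) prints each cache together with the cpuset it covers, so a shared cache shows up as one object spanning several cores:

/* Minimal sketch: list every cache and the cpuset it covers. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    unsigned depth, maxdepth = hwloc_topology_get_depth(topo);
    for (depth = 0; depth < maxdepth; depth++) {
        unsigned i, n = hwloc_get_nbobjs_by_depth(topo, depth);
        for (i = 0; i < n; i++) {
            hwloc_obj_t obj = hwloc_get_obj_by_depth(topo, depth, i);
            if (obj->type != HWLOC_OBJ_CACHE)
                continue;
            char cpuset[256];
            hwloc_bitmap_snprintf(cpuset, sizeof(cpuset), obj->cpuset);
            printf("L%u cache, %llu KB, covers cpuset %s\n",
                   obj->attr->cache.depth,
                   (unsigned long long)(obj->attr->cache.size / 1024),
                   cpuset);
        }
    }

    hwloc_topology_destroy(topo);
    return 0;
}

Build with something like "gcc cache_report.c -o cache_report -lhwloc" (file name is just an example); if an L3 line lists a cpuset containing several cores, that cache is shared.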
On Nov 7, 2012, at 7:17 PM, Brice Goglin wrote:
> What processor and kernel is this? (see /proc/cpuinfo, or run "lstopo
> -v" and look for attributes on the Socket line)
On Nov 7, 2012, at 7:19 PM, Brice Goglin wrote:
> * hwloc does everything libnuma does, but it does a lot more (everything
> that isn't related to NUMA)
Here's my 1-line description:
libnuma is old bustedness; hwloc is new hotness.
:-)
--
Jeff Squyres
jsquy...@cisco.com
I am replying to my own post, since no one else replied.
With the help of MVAPICH2 developer S. Potluri, the problem was isolated and
fixed. It was, as expected, due to the library not intercepting
the cudaHostAlloc() and cudaFreeHost() calls to register pinned memory, as
would be required for th
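For context, the calls in question follow the usual CUDA runtime pattern; the sketch below (an illustration only, with an arbitrary buffer size, not code from the application discussed) shows the kind of pinned-memory allocation an MPI library would have to notice:

/* Sketch: allocate pinned host memory via the CUDA runtime; an MPI
 * library that wants to reuse the pinned registration must intercept
 * these calls. */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    void *buf = NULL;
    size_t len = 1 << 20;   /* 1 MB, arbitrary example size */

    cudaError_t err = cudaHostAlloc(&buf, len, cudaHostAllocDefault);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    /* ... hand buf to MPI_Send/MPI_Recv as an ordinary host buffer ... */

    cudaFreeHost(buf);
    return 0;
}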
On 07/11/2012 21:26, Jeff Squyres wrote:
> On Nov 7, 2012, at 1:33 PM, Blosch, Edwin L wrote:
>
>> I see hwloc is a subproject hosted under OpenMPI but, in reading the
>> documentation, I was unable to figure out if hwloc is a module within
>> OpenMPI, or if some of the code base is borrowed into OpenMPI, or something
>> else.
What processor and kernel is this? (see /proc/cpuinfo, or run "lstopo
-v" and look for attributes on the Socket line)
Your hwloc output looks like an Intel Xeon Westmere-EX (E7-48xx or
E7-88xx).
The likwid output is likely wrong (maybe confused by the fact that
hardware threads are disabled).
Brice
>>> In your desired ordering you have rank 0 on (socket,core) (0,0) and
>>> rank 1 on (0,2). Is there an architectural reason for that? Meaning
>>> are cores 0 and 1 hardware threads in the same core, or is there a
>>> cache level (say L2 or L3) connecting cores 0 and 1 separate from
>>> cores
On Nov 7, 2012, at 1:33 PM, Blosch, Edwin L wrote:
> I see hwloc is a subproject hosted under OpenMPI but, in reading the
> documentation, I was unable to figure out if hwloc is a module within
> OpenMPI, or if some of the code base is borrowed into OpenMPI, or something
> else. Is hwloc used by OpenMPI internally? Is it a layer above libnuma?
In your desired ordering you have rank 0 on (socket,core) (0,0) and
rank 1 on (0,2). Is there an architectural reason for that? Meaning
are cores 0 and 1 hardware threads in the same core, or is there a
cache level (say L2 or L3) connecting cores 0 and 1 separate from
cores 2 and 3?
hwloc's lstopo
I see hwloc is a subproject hosted under OpenMPI but, in reading the
documentation, I was unable to figure out if hwloc is a module within OpenMPI,
or if some of the code base is borrowed into OpenMPI, or something else. Is
hwloc used by OpenMPI internally? Is it a layer above libnuma? Or is
I am trying to map MPI processes to sockets in a somewhat compacted pattern and
I am wondering what the best way to do it is.
Say there are 2 sockets (0 and 1), each processor has 4 cores (0,1,2,3), and
I have 4 MPI processes, each of which will use 2 OpenMP threads.
I've re-ordered my parallel wor
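One way to get that kind of compact placement with Open MPI is a rankfile; the sketch below is only a suggestion (the host name "nodeA" and file name "myrankfile" are placeholders), binding each rank to two consecutive cores of one socket:

rank 0=nodeA slot=0:0-1
rank 1=nodeA slot=0:2-3
rank 2=nodeA slot=1:0-1
rank 3=nodeA slot=1:2-3

Launched with something like "mpirun -np 4 -rf myrankfile ./my_hybrid_app" and OMP_NUM_THREADS=2, each rank keeps its two OpenMP threads inside its two-core binding.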
I am using this parameter "shmem_mmap_relocate_backing_file" and noticed that
the relocation variable is identified as
"shmem_mmap_opal_shmem_mmap_backing_file_base_dir" in its documentation, but
then the next parameter that appears from ompi_info is spelled differently,
namely "shmem_mmap_back
Try one of these:
http://www.scl.ameslab.gov/netpipe/
http://mvapich.cse.ohio-state.edu/benchmarks/osu-micro-benchmarks-3.7.tar.gz
george.
On Nov 7, 2012, at 00:30 , huydanlin wrote:
> Hi,
> Does anyone know of an MPI program used to measure communication round-trip
> time on a cluster (like the ping command on a network)?
Yes, we definitely should do so - will put it on the Trac system so it gets
done.
Thanks - and sorry it wasn't already there.
On Wed, Nov 7, 2012 at 4:49 AM, Iliev, Hristo wrote:
> Hello, Markus,
>
> The openib BTL component is not thread-safe. It disables itself when the
> thread support level is MPI_THREAD_MULTIPLE.
Hello, Markus,
The openib BTL component is not thread-safe. It disables itself when the
thread support level is MPI_THREAD_MULTIPLE. See this rant from one of my
colleagues:
http://www.open-mpi.org/community/lists/devel/2012/10/11584.php
A message is shown but only if the library was compiled wi
Hello,
I've compiled Open MPI 1.6.3 with --enable-mpi-thread-multiple --with-tm
--with-openib --enable-opal-multi-threads.
When I use, for example, the pingpong benchmark from the Intel MPI
Benchmarks, which calls MPI_Init, the openib BTL is used and everything
works fine.
When instead the benchmark calls MPI_Init_thread requesting
MPI_THREAD_MULTIPLE, the openib BTL is not used.
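For reference, the case that triggers the behaviour is the threaded initialization; a minimal sketch (not taken from the benchmark) that requests MPI_THREAD_MULTIPLE and reports what the library granted looks like this:

/* Sketch: request full thread support and print the level provided. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        printf("requested MPI_THREAD_MULTIPLE, got level %d\n", provided);
    MPI_Finalize();
    return 0;
}

Note that the provided level only tells you what the MPI library granted; the openib BTL may still have disabled itself silently, which is the behaviour described in the reply above.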
Hello Adam,
> I was able to build successfully by manually substituting the correct
location into the Makefile in question.
Another, more convenient workaround would be to add the following option to
the Open MPI configure command:
--with-contrib-vt-flags="--with-cuda-dir=$CUDA_HOME"
The
Hi,
Does anyone know of an MPI program used to measure communication
round-trip time on a cluster (like the ping command on a network)?
That is, the program would have each process run on each node of the cluster,
then use MPI to calculate the communication round-trip time among nodes.
Thanks
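If a ready-made benchmark (such as the NetPIPE and OSU links posted earlier in the digest) is overkill, a minimal two-rank ping-pong along these lines measures the round-trip time with MPI_Wtime; it is only a sketch, with an arbitrary iteration count and 1-byte messages:

/* Minimal ping-pong sketch: rank 0 and rank 1 bounce one byte back and
 * forth and report the average round-trip time. Run with "mpirun -np 2". */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i, iters = 1000;
    char byte = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("average round-trip time: %g microseconds\n",
               (t1 - t0) / iters * 1e6);

    MPI_Finalize();
    return 0;
}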