Richard Graham wrote:
Re: [OMPI devel] shared-memory allocations
It does not make a difference who allocates it; what makes a
difference is who touches it first.
Fair enough, but the process that allocates it starts initializing it
right away. So, each circular buffer is set up
To expand slightly on Patrick's last comment:
> Cache prefetching is slightly more efficient on the local socket, so
> memory closer to the reader may be a bit better.
Ideally one polls from cache, but if the line is evicted, the next poll
after the eviction will pay a lower cost if the memory
Richard Graham wrote:
Yes, it is polling volatile memory, so it has to load from memory on
every read.
Actually, it will poll in cache and only load from memory when the
cache coherency protocol invalidates the cache line. Volatile semantics
only prevent compiler optimizations.
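A minimal sketch of the polling point above, in C. The names poll_until_set and max_spins are hypothetical and are not taken from the Open MPI sources. The volatile qualifier makes the compiler emit a load on every iteration, but on a cache-coherent machine that load is normally served from the local cache until another core writes the line and the coherency protocol invalidates it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration; not Open MPI code.
 *
 * Spins on *flag until it becomes nonzero or until max_spins polls
 * have been made. Returns the number of unsuccessful polls that
 * preceded the successful one, or -1 if the flag was never seen set.
 *
 * "volatile" only stops the compiler from hoisting the load out of
 * the loop or caching the value in a register. It says nothing about
 * whether the hardware satisfies the load from cache or from DRAM. */
long poll_until_set(const volatile int *flag, long max_spins)
{
    for (long spins = 0; spins < max_spins; ++spins) {
        if (*flag != 0) {
            return spins;
        }
    }
    return -1;
}
```

In a real receiver the flag would live in the shared-memory segment and be written by another process; here a plain int is enough to show the access pattern.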
> On 12/12/08 8:21 PM, "Eugene Loh" wrote:
>
> Richard Graham wrote:
> The memory allocation is intended to take into account that two separate
> procs may be touching the same memory, so the intent is to red
Richard Graham wrote:
The memory allocation is intended to take
into account that two separate procs may be touching the same memory,
so the intent is to reduce cache conflicts (false sharing)
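
One common way to reduce false sharing, sketched here under stated assumptions: put fields that different processes update on separate cache lines. The 64-byte line size, the type name padded_fifo_ctl_t, and the helper on_separate_lines are all hypothetical; this is not the layout Open MPI actually uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size; the real value depends on the CPU. */
#define CACHE_LINE_SIZE 64

/* Hypothetical control block: each index is aligned to its own cache
 * line, so a write to one does not invalidate the line holding the
 * other in a different process's cache. */
typedef struct {
    alignas(CACHE_LINE_SIZE) volatile uint32_t head;
    alignas(CACHE_LINE_SIZE) volatile uint32_t tail;
} padded_fifo_ctl_t;

/* Returns 1 if the two addresses fall on different cache lines. */
int on_separate_lines(const void *a, const void *b)
{
    return ((uintptr_t)a / CACHE_LINE_SIZE) !=
           ((uintptr_t)b / CACHE_LINE_SIZE);
}
```

The cost is memory: two 4-byte indices now occupy 128 bytes, which matters when one such block exists per connection.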
Got it. I'm totally fine with that. Sep
It has been a long time since I wrote the original code, and things have
changed a fair amount since that time, so bear this in mind.
The memory allocation is intended to take into account that two separate
procs may be touching the same memory, so the intent is to reduce cache
conflicts (false sh
On Dec 10, 2008, at 1:11 PM, Eugene Loh wrote:
For shared memory communications, each on-node connection (non-self,
sender-receiver pair) gets a circular buffer during MPI_Init(). Each CB
requires the following allocations:
*) ompi_cb_fifo_wrapper_t (roughly 64 bytes)
*) ompi_cb_fifo_ctl_t head (roughly 12 bytes)
*) ompi_cb_fifo_ctl_t tail
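
To get a feel for how the per-connection allocations add up, here is a small sketch that counts circular buffers on a node. It assumes "non-self, sender-receiver pair" means one CB per ordered pair of distinct on-node processes; that reading, and the function name cb_count, are assumptions rather than something taken from the Open MPI code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of circular buffers set up for
 * n_local_procs on-node processes, assuming one CB per ordered
 * (sender, receiver) pair with sender != receiver. */
size_t cb_count(size_t n_local_procs)
{
    if (n_local_procs < 2) {
        return 0; /* a lone process has no non-self connection */
    }
    return n_local_procs * (n_local_procs - 1);
}
```

Under this assumption the count grows quadratically: 8 on-node processes would need 56 CBs, so the wrapper structures alone (roughly 64 bytes each) come to about 3.5 KB before any of the other allocations are counted.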