Per e.g. the Intel ARK entry for the E5-2687W v3 
<https://ark.intel.com/products/81909/Intel-Xeon-Processor-E5-2687W-v3-25M-Cache-3_10-GHz>
 
the theoretical memory bandwidth of a single E5-2687W v3 socket is 68 GB/sec 
(4 DDR4-2133 channels x 2133 MT/sec x 8 bytes per transfer), and for a 
2-socket system that would be 136 GB/sec. Achieving that full theoretical 
bandwidth would require the right alignment of stars though, including the 
right DIMM population, number of ranks, access pattern within each channel, 
and full use of all memory channels simultaneously.

I'd say that if you see 68GB/sec (50% of max theoretical) you are probably 
in fairly good shape.

If you want to test this empirically, I'd do something like repeated (dd 
bs=2m if=/tmp/oneGigFile of=/dev/null) or equivalent in multiple shells 
simultaneously (make a 1 gig file with mkfile -n 1g /tmp/oneGigFile), and 
grow the number of shells until the aggregate reported throughput peaks. 
On Linux, GNU dd spells the block size bs=2M, and if mkfile isn't available, 
(dd if=/dev/zero of=/tmp/oneGigFile bs=1M count=1024) makes the same file. 
You can obviously write a short C program to do this as well.

On Sunday, January 14, 2018 at 8:20:07 PM UTC-8, Peter Veentjer wrote:
>
> I discovered that the system wasn't effectively utilizing the CPUs. 
> Initially I thought it was caused by CPU throttling due to overheating, but 
> after reading out the temperatures, this hypothesis turned out to be incorrect.
>
> I have increased the number of threads that generate requests, and the 
> sawtooth pattern in the CPU load (forking and joining) has disappeared; the 
> load now stays constant at roughly 80%. I'm currently at an aggregate 55 GB/s. 
>
> I also played with a 50 GB offheap chunk and I'm up to 68 GB/s. 
>
> It would still be interesting to know if there is a tool that can show the 
> maximum bandwidth of the memory bus.
>
> On Sunday, January 14, 2018 at 8:44:00 PM UTC+2, Peter Veentjer wrote:
>>
>> I'm working on some very simple aggregations on huge chunks of offheap 
>> memory (500GB+) for a hackathon. This is done using a very simple stride; 
>> every iteration the address increases by 20 bytes. So the prefetcher 
>> should not have any problems with it.
>>
>> According to my calculations I'm currently processing 35 GB/s. However 
>> I'm not sure if I'm close to the maximum bandwidth of this machine. Specs:
>> 2133 MHz, 24x HP 32GiB 4Rx4 PC4-2133P
>> 2x Intel(R) Xeon(R) CPU E5-2687W v3, 3.10GHz, 10 cores per socket
>>
>> What is the best tool to determine the maximum memory bandwidth of a 
>> machine running Linux (RHEL 7)?
>>
>
