Re: [Beowulf] onload vs offload

2016-09-25 Thread Douglas Eadline

Mark,

I just wrote a white paper about this for InsideHPC and Mellanox.
I did not do any benchmarking myself and relied on data
from Mellanox (and the focus was the co-design concept).

You can download the white paper (probably pay with your email address)
here:
http://insidehpc.com/white-paper/insidehpc-guide-to-co-design-architecture/

Alternatively, the various sections of the paper are posted openly
on the InsideHPC site. Just search "Eadline" on InsideHPC and it
will give a list of all the articles.

Two things I find interesting: 1) the Mellanox offload
into the fabric, and 2) the trends in processor clock rate.
Faster clock rates are better for on-loading, but
as more cores are crammed onto processors the clock rates
are actually dropping; if you stay with low core counts
you can, of course, get faster cores.
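
To put rough numbers on that trade-off: here is a back-of-envelope
sketch (my own illustration, with made-up round numbers rather than
anything from the white paper) of the cycle budget an onloaded
network stack gets per message at a given core clock and target
message rate:

/* Back-of-envelope: cycle budget per message for an onloaded
 * network stack.  All numbers are illustrative, not measured. */
#include <stdio.h>

int main(void)
{
    /* few fast cores vs. many slower cores */
    double clock_hz[] = { 3.5e9, 2.5e9, 2.0e9 };
    double msg_rate   = 10e6;   /* assume 10 M messages/s per core */

    for (int i = 0; i < 3; i++)
        printf("%.1f GHz core: %4.0f cycles per message\n",
               clock_hz[i] / 1e9, clock_hz[i] / msg_rate);
    return 0;
}

Every cycle the stack burns is a cycle the application loses, which
is why moving that work into the NIC/fabric looks better and better
as per-core clocks drop.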

--
Doug

> I was reviewing some rather fetid marketing collateral
> about this topic, and finding mostly stuff from 2010ish.
> A lot has changed since then: onboard PCIe, CPU speed,
> inter-socket bus, NUMA sensitivity of the kernel, lots
> more cores, mem BW, presumably smarter applications, etc.
>
> Does anyone have comments on recent generations of onload
> vs offload interconnect performance?  Please don't respond
> unless it's recent and fully quantified (HW config, how
> measured, etc).
>
> I'd also be interested to hear from MPI/app people about how useful
> offload really is (how often can real apps leverage RDMA ops,
> or the simple sorts of collectives that are offloadable?)
>
> As keeper of probably the oldest living Quadrics system, I appreciate
> the appeal of offload.  OTOH, there's no question that onloading puts
> a lot of performance potential into the CPU-designer's hands...
>
> thanks, mark hahn.




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] onload vs offload

2016-09-22 Thread Mark Hahn

I was reviewing some rather fetid marketing collateral
about this topic, and finding mostly stuff from 2010ish.
A lot has changed since then: onboard PCIe, CPU speed,
inter-socket bus, NUMA sensitivity of the kernel, lots
more cores, mem BW, presumably smarter applications, etc.

Does anyone have comments on recent generations of onload
vs offload interconnect performance?  Please don't respond 
unless it's recent and fully quantified (HW config, how 
measured, etc).
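
(By "how measured" I mean at least something like the bare ping-pong
below; in practice I'd expect the OSU micro-benchmarks plus the full
HW/SW configuration. The sketch is only an illustration of the
measurement, not a result:)

/* Minimal ping-pong latency microbenchmark between ranks 0 and 1.
 * Real comparisons should use the OSU micro-benchmarks and report
 * the exact HW/SW configuration. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 10000;
    char buf = 0;
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("one-way latency ~ %.2f us\n",
               (t1 - t0) / (2.0 * iters) * 1e6);
    MPI_Finalize();
    return 0;
}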


I'd also be interested to hear from MPI/app people about how useful 
offload really is (how often can real apps leverage RDMA ops, 
or the simple sorts of collectives that are offloadable?)
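
(For "offloadable" think, e.g., of the MPI-3 nonblocking collectives:
the point of collective offload is that the operation can progress in
the NIC while the host computes. A minimal, untested sketch of the
pattern follows; whether any real overlap happens depends entirely on
the MPI library and the hardware.)

/* Sketch: overlap an allreduce with local work via MPI-3
 * nonblocking collectives (MPI_Iallreduce).  With collective
 * offload in the NIC the reduction can progress in hardware
 * while the host core does other work; on a pure onload stack,
 * progress may only happen inside MPI calls. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = (double)rank, global = 0.0;
    MPI_Request req;
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... do work here that does not depend on 'global' ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);
    if (rank == 0)
        printf("sum of ranks = %g\n", global);
    MPI_Finalize();
    return 0;
}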


As keeper of probably the oldest living Quadrics system, I appreciate
the appeal of offload.  OTOH, there's no question that onloading puts
a lot of performance potential into the CPU-designer's hands...

thanks, mark hahn.
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf