On Wed, 10 Jun 2015, Hefty, Sean wrote:

> > There are multiple problems with libfabric related to the use cases in my
> > area. Most of all the lack of multicast support. Then there is the buildup
> > of software bloat on top. The interest here is in low latency
> > operations. Rendezvous and other new features are really not wanted if
> > they increase the latency.
>
> Multicast is only supported by one vendor that has taken a hostile position
> against libfabric.  Support for multicast will eventually be there, but
> it's definitely not a priority for me.  As an open source project, anyone is
> welcome to propose patches.

Intel is supporting multicast in hardware. It's just a bad implementation
(broadcast and filtering MC groups in the HCA or what was that?) and there
is no plan to fix the issues despite the problem being known for quite
some time. Also, does this mean that libfabric is only to support the
features needed by Intel?

> For native providers, libfabric will reduce latency.  That's a provider
> implementation issue, and native providers will be available soon.  The OFIWG
> selected to have a working set of interfaces that applications can begin
> using immediately, versus waiting until there were a large set of native
> providers.

I would be interested to see some measurements. AFAICT the Intel solutions
are based on historically inferior IB technology from Qlogic, which has
never been able to compete latency-wise with other vendors in my lab
tests. I have heard these latency claims repeatedly from Qlogic
personnel over the years.

> IMO, this is exactly the problem.  The entire design is being driven by
> the implementation.  That produces an unmaintainable API and fractures the
> software ecosystem, which is exactly where we are today.

This is a well-designed solution and it's easy to use.

It would help libfabric if you would work with other vendors and
industries to include support for their needs. MPI applications are not
the only ones running on the fabrics. I understand that is
historically the only area in which Qlogic hardware was able to compete,
but I think you need to move beyond that. APIs should be as general as
possible, abstracting hardware as much as possible. A viable libfabric
needs to be easy to use and low overhead, and it needs to cover the
requirements of multiple vendors and use cases.
