Hi all,

I know I am new to this project, and I may be naive, but I want to understand a
few things concerning the OpenIB architecture. In the course of learning the
OpenIB gen2 stack and preparing to port OpenSM to it (which is my current task),
I have encountered a few areas that seem problematic to me, and I would like to
understand the reasoning behind them, if not to offer alternatives. I am sorry
to raise these issues so late, but I was not involved in this project earlier.
I hope it is better late than never.

It
seems to me that the major design approach is to do everything in the kernel,
but to let user-mode code access the lower levels so that performance-sensitive
applications can bypass all the kernel layers. Am I right?

It
also seems that, within the kernel, the IB interface/verbs (ib_*) are very
close to the mthca verbs, which in turn are very close to VAPI. I know that
this is the way most of the industry has been working, but I wonder: is this
the correct model? Will it not pollute the kernel with a lot of IB-specific
stuff? Personally, I think that the IB verbs (VAPI) are so complicated that
another level of abstraction is required. PDs, MRs, QPs, the QP state machine,
PKeys, MLIDs and other "curses": why should a module such as IPoIB know about
them? If the answer is performance, then I have to disagree. In the same
fashion you could argue that, in order to achieve efficient disk I/O,
applications should know the disk's geometry and be able to do direct I/O to
the disk firmware, or that applications should speak SCSI to optimize their
data transfers.

It
seems to me that the current interfaces evolved into what they are today mainly
because of the way IB itself evolved: with a lot of uncertainty and a lot of
design holes (not to say "craters"). This forced most of the industry to stick
with very straightforward interfaces based on the Mellanox VAPI.

I
wonder if this is not the right time to come up with a much better abstraction,
for both user mode and kernel mode. For example, it seems that the abstraction
layer should abstract the IB networking objects and not the IB HCA interface.
In other words, why not build the abstraction around the IB networking types:
UD, RC, RD, MADs? Why do we have to expose the memory management model of the
driver/HCA to the upper layers? Do we really want to expose IB-specific
structures such as CQs, QPs and WQEs? Why? Not only is this bad for
abstraction, meaning that changes in the drivers will require modifications in
the upper layers, but it is also very problematic for security and stability
reasons.

I
think that using the correct abstraction is critical for real acceptance in the
Linux/open source world. A good abstraction will also enable us to provide
good, secure kernel-mode and user-mode interfaces and access.

Once
we have such interfaces, I think we should reconsider the user/kernel division.
As a general rule, I think it is commonly agreed that the kernel should include
only things that must be in the kernel, meaning hardware-aware software and
very performance-sensitive software. Other software modules may be moved into
the kernel once they are mature and robust. For example, RPC, NFSD and SMBFS
(Samba) were developed in user mode, served many years in user mode, and only
after they had matured did they start to "sink" into the kernel. I think that
IB, and especially the IB management modules, are far from being mature. Even
the IB standard itself is not really stable. Specifically, there is a
requirement (in the SOW) to make IB management distributed, due to scalability
and other (redundancy, etc.) requirements. I do not know if this requirement
will actually materialize, but if it does, the SM, and maybe also the SMI/GSI
agents and the CM, will have to change significantly. If this is likely to
happen, I would suggest keeping as much as possible in user mode, where it is
much easier to develop and to update. We should have kernel-based agents and
mechanisms to assist performance, but I think that most of the work should be
done in user mode, where it can do less harm. Specifically, things such as a
MAD transaction manager (retries, timeouts, matching), RMPP and others should
be developed in user mode and packaged as libraries, again, at least until
they stabilize and mature. Why should we develop complicated functionality
such as RMPP in the kernel when only a few kernel-based queries (if any at
all) will use it?

If
I am not mistaken, one of the IB design goals was to enable efficient *user
mode* networking (copy-less, low latency). This is also the major advantage IB
has over several alternatives, most notably 10G Ethernet. If we do not
emphasize such advantages, we reduce IB's chances of survival once 10GE is
widely used. If potential users get the impression that, compared to 10GE, IB
is cheaper, faster and more efficient, but requires tons of special
kernel-based modules and very complicated interfaces, and is therefore much
less stable and much more exposed to bugs, they will use 10GE. I have no
doubt. Yes, it is true that this project is meant to supply an HPC code base,
but eventually IB will not survive as an HPC interconnect only. Furthermore,
all HPC applications are user-mode based. Good user-mode interfaces are
critical for HPC, no less than for any other high-end networking application.

I
really would like to know whether I am shooting in the dark, or whether the
issues I mentioned were already discussed and there are good reasons to do
things the way they are. Or maybe I don't get the picture, and the state of
things is completely different from what I am painting. Either way, I would
like to know what you think.

Thanks,
Shahar
_______________________________________________ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
