Address mapping policies depend on the address type. This patch only
defines a policy for mapping link-local addresses, and we should
indeed take care not to change it (if possible).
Later on, we can add more policies for other address types (e.g.,
normal IPv6 addresses, mapped IPv4 addresses,
In this case it will not be just mapping remote addresses but creating
the required AH information which is unique to each device.
I understand that the AH is per device. What I don't get is why we would want
each device to perform the mapping. We don't expect the device to map GIDs to
LIDs
Main changes from 1.5.1:
===
1. Updated packages:
- Management
Using latest daily builds from
http://www.openfabrics.org/downloads/management/daily
- Updated libnes
libnes-1.0.1-0.1.g89ea0ee.tar.gz
- Updated libsdp
I just noticed that Vlad already opened a bugzilla bug (1874) on this.
I quote Sean's response:
RDMA CM supports UD and RC QPs (port spaces UDP/TCP) only. Support
for UC QPs should come from another port space.
This makes sense to me. Still we need to address the issues I raised
below. Sean,
OK, I'm planning on sending this upstream later today. Looks very small
and simple, and then we can figure out what, if anything, we want to do
for 2.6.34.
Does that make sense to everyone?
yes - thanks
___
ewg mailing list
ewg@lists.openfabrics.org
Sean, can you try openmpi? It fails for me, and yet ucmatose succeeds.
I don't understand the difference yet...
I believe the issue is that rdma_bind_addr succeeds (returns 0), but no device
is assigned to the rdma_cm_id (verbs field is NULL).
This was a change from commit
Also note that trying to bind rdma cm to all interface ip addresses was the way
that we were advised by openfabrics to figure out which devices are rdma-
capable.
As such, it is highly desirable to get the fix transparently in rdmacm and
preserve the old semantic. More specifically, it seems
We should work to get this 'correct' when merging upstream.
Following the spirit of the current code, it is probably cma_acquire_dev()'s
job to fill in the missing ibdev type information after matching the netdev to
an ibdev.
This makes sense to me.
P.S. - I really wish that we had a cleaner
But how can you determine _which_ rdma device should be used if an app
binds to 127.0.0.1? I think this is busted...
The code just picks the first rdma device available. To me, this is preferable
to simply disallowing the loopback device from working at all. I personally
use it all the
Well then the rdma-cm needs to know which devices support hw loopback.
Cuz on a T3-only system, no hwloop...
The problem sounds like it's more than just whether 127.0.0.1 is usable. That
check may fix openmpi, but it sounds more like the app needs to know whether the
device can actually support
This solution would work. Will you code it up?
I can do that. I just want to make sure that we address the full scope of the
problem.
- Sean
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Sean could this be a result of the new Ipv6 rdma_cm patches that were
added to OFED-1.5.1 ?
I guess it's possible. I thought DAPL only used ipv4 addresses though.
Can you try running some simpler tests, like ucmatose or rping?
on server side run: ucmatose [-b optional_local_ip_address]
on
[wo...@det-16 ~]$ ucmatose -s 192.168.2.17
cmatose: starting client
Btw - this should be 'ucmatose -s 192.168.0.17', and needs to start
after the server is running. But, this isn't going to work since...
[wo...@det-17 src]$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:04:23:AF:8E:CE
+ 2. SCM - uses sockets to exchange QP information. IPoIB, ARP, and SA
queries NOT required.
This is only true if all nodes are connected with a second networking fabric.
+ 3. UCM - uses IB UD QP to exchange QP info. Sockets, ARP, IPoIB, and SA
queries NOT required.
Same as above.
@@ -306,7 +314,7 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size,
 return NULL;
 }
- memset(ctx->buf, 0, size);
+ memset(ctx->buf, 0x7b + is_server, size);
 ctx->context = ibv_open_device(ib_dev);
 if (!ctx->context) {
@@
Sean, is the patch series (9 patches) you sent earlier still current?
Jason, do you have any updates?
Roland added at least one change to fix the build when ipv6 is not enabled. I
would check his for-next branch to see what he actually has queued.
- Sean
So what will this new librdmacm package cover that wasn't possible so
far? Do you refer to ipv6 support in mckey? Anything else?
Changes were your changes to mckey, plus changes Dave added to cmatose to
support IPv6. The actual library itself hasn't been modified.
- Sean
Okay, got it. I was under the impression that mckey still lacks an
option to take from the user an ipv6 multicast address which is neither
all zeros nor unmapped, correct? Or will the -m option work with both
ipv4 and ipv6 addresses?
The -m and -b options work with IPv4 and IPv6 addresses.
- Sean
On the topic of scalability and possible future enhancements for scalability,
one person asked for verbs extensions to allow asynchronous QP create and
modify calls
WinOF has asynchronous interfaces for modify QP, and limited testing has shown
that it can improve connection times. QP transitions
Can't anyone get async modify QP today on any platform by just doing the
operation in another thread (or thread pool)? It seems that the
operations themselves are heavy enough that thread dispatch, locking etc
is going to be significant overhead.
On WinOF this is basically how things are
This patch adds ipv6 support to ucmatose.
Signed-off-by: David Wilder dwil...@us.ibm.com
Thanks.
I pulled this patch into my local git tree with just a couple of minor cleanups.
What other patches, if any, did you use to test with it?
- Sean
From: David J. Wilder [dwil...@us.ibm.com]
Signed-off-by: David Wilder dwil...@us.ibm.com
Signed-off-by: Sean Hefty sean.he...@intel.com
---
We need to update struct cmatest to allow storing an IPv6 address,
or we can overrun the buffer.
Running with this patch, the client causes a kernel bug
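The buffer-overrun concern above can be illustrated with a minimal sketch, assuming the fix amounts to storing addresses in a struct sockaddr_storage, which is large enough for either family. The names addr_slot and addr_slot_set are hypothetical and are not taken from cmatose; this is not the actual patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a test context that must hold either family. */
struct addr_slot {
	struct sockaddr_storage addr;	/* large enough for sockaddr_in6 */
};

/*
 * Parse a numeric IPv4 or IPv6 string into slot->addr.
 * Returns the address family on success, or -1 if the text is neither.
 */
static int addr_slot_set(struct addr_slot *slot, const char *text)
{
	struct sockaddr_in *in4 = (struct sockaddr_in *) &slot->addr;
	struct sockaddr_in6 *in6 = (struct sockaddr_in6 *) &slot->addr;

	assert(slot != NULL && text != NULL);
	memset(&slot->addr, 0, sizeof(slot->addr));

	if (inet_pton(AF_INET, text, &in4->sin_addr) == 1) {
		in4->sin_family = AF_INET;
		return AF_INET;
	}
	if (inet_pton(AF_INET6, text, &in6->sin6_addr) == 1) {
		in6->sin6_family = AF_INET6;
		return AF_INET6;
	}
	return -1;
}
```

A field sized for struct sockaddr_in alone would be too small for the 28-byte struct sockaddr_in6, which is the overrun described above.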
For ipv6 I ran what I described previously. What I do need to do is add
the option to rping to specify a source address and run it with various
addresses. Any help you can give defining what exactly needs to be tested
would be appreciated.
You can also test with ucmatose to verify ipv4 still
This patch, as Jason suggested, moves the function of addr_resolve_local()
into addr4_resolve_remote() and addr6_resolve_remote(). It eliminates the
need for addr_resolve_local().
One quick comment, remove '_remote' from function names:
addr4_resolve_remote, addr6_resolve_remote, and
$ ip route get 10.0.0.11 from 192.168.122.1
local 10.0.0.11 from 192.168.122.1 dev lo
cache local mtu 16436 advmss 16396 hoplimit 64
(192.168.122.1 is bound to a different device on my system than
10.0.0.11)
The new case trips the if == loopback and does
rdma_translate_ip(10.0.0.11)
The
The local loopback case uses PRs?
Yes - the rdma cm makes no distinction when resolve route is called. It does a
PR query.
Even so, it still seems OK to me:
Path:
addr4_resolve_remote
$ ip route get 10.0.0.11 from 192.168.122.1
local 10.0.0.11 from 192.168.122.1 dev lo
srcIP =
That is very difficult to fit into the semantics the IP routing
model uses :( And it looks like an API problem in DAPL :(
This depends on if your view is that the rdma cm is trying to match IP routing,
trying to use IP addresses as convenient names for RDMA ports, or something in
between that may
Thanks Or. This one is already in OFED 1.4.2 but apparently this is a
different problem. Once I have information whether the patch Roland
posted fixed it I will update the list.
Eli, did you find a commit that fixes the problem you reported on?
Or.
Not yet :-(
I can't find anything off in
@@ -393,7 +393,7 @@ static int addr_resolve_local(struct sockaddr *src_in,
for_each_netdev(init_net, dev)
if (ipv6_chk_addr(init_net,
- ((struct sockaddr_in6 *) addr)->sin6_addr,
+
+#ifdef __WIN__
+#define OSM_LOG(log, level, fmt, ...) \
+do { \
+ if (osm_log_is_active(log, (level))) \
+ osm_log(log, level, "%s: " fmt, __func__, ## __VA_ARGS__); \
__VA_ARGS__ should work on any platform. libibmad : mad.h uses this for windows
and linux.
+} while (0)
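For readers unfamiliar with the idiom in the quoted macro, here is a minimal self-contained sketch of a logging macro that prepends __func__ and uses the GNU-style ## __VA_ARGS__ comma swallowing. demo_log, DEMO_LOG and probe are hypothetical names for illustration, not OpenSM code, and the comma-swallowing behavior is a compiler extension (supported by GCC and Clang) rather than something guaranteed by the C standard.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char last_line[128];	/* most recent formatted message */

/* Hypothetical stand-in for a printf-style logging backend. */
static void demo_log(int level, const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	vsnprintf(last_line, sizeof(last_line), fmt, ap);
	va_end(ap);
	fprintf(stderr, "[%d] %s", level, last_line);
}

/* Prefix every message with the calling function's name. */
#define DEMO_LOG(level, fmt, ...) \
do { \
	demo_log((level), "%s: " fmt, __func__, ## __VA_ARGS__); \
} while (0)

/* Logs two messages; returns the length of the last formatted line. */
static size_t probe(void)
{
	DEMO_LOG(1, "no extra arguments\n");	/* ## drops the trailing comma */
	DEMO_LOG(2, "port %d\n", 3);
	return strlen(last_line);
}
```

The do { ... } while (0) wrapper lets the macro be used as a single statement, for example as the body of an unbraced if.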
Thanks Or. This one is already in OFED 1.4.2 but apparently this is a
different problem. Once I have information whether the patch Roland
posted fixed it I will update the list.
If ibnetdiscover doesn't use RMPP as Hal indicated, I don't think Roland's patch
will help.
ibnetdiscover D 80149b8d 0 26968 26544
(L-TLB)
8102c900bd88 0046 81037e8e 81037e8e02e8
8102c900bd78 000a 8102c5b50820 81038a929820
011837bf6105 0ede 8102c5b50a08 0001
Call Trace:
[80064207]
In the case of a failure I would like APM to move my connection from one subnet
to the other.
My question is this: assuming that there is a failure in one of the switches,
will the CM still work, or will it also fail? Obviously if it fails, I need to
find a different solution.
This should work if
Should I expect the CM to send the MADs (that are used to manage the
connection) on a different subnet every time there is an APM migration?
Does this code work on Linux?
This should work on Linux. I didn't write the Windows CM, so I'm not positive
that it does this as well, but looking at the
- ctx->cq = ibv_create_cq(ctx->context, ctx->rx_depth, NULL, ctx->channel, 0);
+ ctx->cq = ibv_create_cq(ctx->context, ctx->tx_depth + ctx->rx_depth,
+ NULL, ctx->channel, 0);
I'm looking at a windows port of this test, but at least there, rx_depth is set
to rx_depth
Sure. Just above the call to ibv_create_cq(), ctx->rx_depth is set to
ctx->rx_depth = rx_depth + tx_depth
but the rest of the code does ibv_post_send() and ibv_post_recv()
based on ctx->tx_depth and ctx->rx_depth, which means the CQ needs
to be ctx->tx_depth + ctx->rx_depth big.
If the tx_depth
Remember that this fix only affects the bi-directional test.
Both client and server are going to post ctx->rx_depth receives
and ctx->tx_depth sends and then check for completions.
It won't post more sends or receives until the completions are
seen.
Okay - I think I understand what's happening.
The
Since Qlogic is using some of the APIs in these files, it was decided not to
remove them in 1.5.
However, Qlogic was asked to approach Sean and see if they can move
their implementation to the new SA API he is developing now.
Has this new SA API been proposed to the list as yet (and I
RDMA over Ethernet (RDMAoE) allows running the IB transport protocol over
Ethernet, providing IB capabilities for Ethernet fabrics. The packets are
standard Ethernet frames with an Ethertype, an IB GRH, unmodified IB transport
headers and payload. HCA RDMAoE ports are no different than regular IB
A few support functions are added to allow the translation from GID to MAC
which is required by hw drivers supporting RDMAoE.
Why not just use IP to MAC calls? Or use the MAC as the GUID?
Do the GIDs follow the IB GID format?
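One way to picture the GID-to-MAC translation mentioned above is the case where the GID is an IPv6 link-local address built from the port MAC with the modified EUI-64 rule (RFC 4291). Whether the patch set uses exactly this mapping is an assumption here, and ll_gid_to_mac is a hypothetical name, not a function from the patches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Recover a 48-bit MAC from a 16-byte GID, assuming the GID is
 * fe80::/64 plus a modified EUI-64 interface ID (ff:fe in the middle,
 * universal/local bit inverted). Returns 0 on success, -1 otherwise.
 */
static int ll_gid_to_mac(const uint8_t gid[16], uint8_t mac[6])
{
	static const uint8_t ll_prefix[8] = { 0xfe, 0x80, 0, 0, 0, 0, 0, 0 };

	assert(gid != NULL && mac != NULL);
	if (memcmp(gid, ll_prefix, sizeof(ll_prefix)) != 0)
		return -1;		/* not link-local */
	if (gid[11] != 0xff || gid[12] != 0xfe)
		return -1;		/* not EUI-64 derived from a MAC */

	mac[0] = gid[8] ^ 0x02;		/* undo the universal/local bit flip */
	mac[1] = gid[9];
	mac[2] = gid[10];
	mac[3] = gid[13];
	mac[4] = gid[14];
	mac[5] = gid[15];
	return 0;
}
```

Under this assumption, the GID fe80::204:23ff:feaf:8ece maps back to the MAC 00:04:23:af:8e:ce.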
Since RDMAoE is using Ethernet there is no need for QP0. This patch will create
only QP1 for RDMAoE ports.
Which modules will use QP1 and for what purpose? I see sa_query/multicast, but
there's not an actual SA. I'm guessing that the ib_cm works without changes.
To clarify, do all IBoE packets
diff --git a/drivers/infiniband/core/multicast.c
b/drivers/infiniband/core/multicast.c
index 107f170..2417f6b 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -488,6 +488,36 @@ retest:
}
}
+struct eth_work {
+ struct work_struct
What does it mean to have a standard body associated with a
proposal? Does associated mean that the proposal/API is a published
standard in that standards body? Or some weaker definition?
There's a significant difference between an API and a wire protocol, and even
most wire protocols shouldn't
This allows getting the type of a port, either Ethernet or IB, which is
required by the following patches that implement RDMA over Ethernet (RDMAoE).
I don't know if this makes more sense without studying the changes in more
detail, but was there a reason why node_type just wasn't extended
Subject: Do you still need the local SA in OFED 1.5?
The RDMA/IB CMs do not scale without PR caching or hard-coding PR parameters.
I'm personally fine removing it from OFED. MPI and other applications are
working around SA scaling issues by connecting over sockets anyway.
...@qlogic.com
Good catch.
Acked-by: Sean Hefty sean.he...@intel.com
* Mellanox suggested adding IB over Eth - this is similar to iWARP but
more like IB (e.g. including UD), and can work over ConnectX.
A concern was raised by Intel (Dave Sommers) since it is not a standard
transport.
Decision: This request will be raised in the MWG, and they should decide
if OFA
Slide 18
Add WinVerbs, WinMad, and 'OFED libraries on Windows' with myself as maintainer.
A maintainer may be needed for the OFED diagnostics tools if we cannot agree to
share a common code-base.
Slide 20
Under WinVerbs, add: Supports OFED libibverbs
Slide 21
Under WinVerbs, add: Support OFED
- Question about IBCM for OFED v1.4:
- Any estimate on the following ticket:
https://bugs.openfabrics.org/show_bug.cgi?id=963
- Any estimate on the ibcm_listen() sometimes failing issue? (I
don't think there's a ticket filed against this yet)
I don't think anything was done
The requirement is mostly driven from the receiving side. For cxgb3 it
is anyway...
Maybe you can help me understand the spec here. If we ignore this feature for a
minute, then the side that calls rdma_connect() must instead issue the first
'send' request to the server. Can the first 'send' be
I've published librdmacm release 1.0.7. It contains the following additions
over 1.0.6:
Add support for rdma_migrate_id(). Kernel support will be in 2.6.25.
Several build fixes from Roland.
Set reject status correctly.
Please pull this package for OFED 1.3.1. The reject status fix has been
Signed-off-by: Sean Hefty [EMAIL PROTECTED]
---
rdma_cm_release_notes.txt | 154 +
1 files changed, 86 insertions(+), 68 deletions(-)
diff --git a/rdma_cm_release_notes.txt b/rdma_cm_release_notes.txt
index 24a07af..8060205 100644
I've pushed out new releases:
libibcm 1.0.2
librdmacm 1.0.6
to my git tree, and the OFA download pages.
Please pull both packages into OFED 1.3. Major changes from previous release:
libibcm - removes obsolete simple.c test program
librdmacm - updates to build, fix setting QP
Can you put a tag with the name ofed_1_3 on these git trees too
I have tags of v1.0.2 and v1.0.6. Can these just be used instead?
It's just easier for us when we do diffs to also have the ofed_1_3 tag, as
in all other git trees.
After some time passes it's hard to remember which version was part of
which release :-(
I've added ofed_1_3 tags. Note that if there's a need to update the libraries
again, the tag will need to be deleted and
In all fairness, the kernel portion of all of this, and the process of
getting things into Linus' kernel, has *always* been a case of staging
things in Roland's tree and then merging upstream. So, at least for the
kernel, that's mostly true as OFED is pretty close to Roland's tree
generally
It's currently in, although that isn't written in stone either.
Individual changes to existing components, like xrc, can slip past me
easier than whole unsubmitted subsystems like RDS.
I think for RedHat it would end up being in the kernel, but Roland's userspace
library doesn't support it, so it
The main reason is not the bugs but the features supported by IBM - CM
support for non SRQ and 4K MTU
These are entirely my opinions, but...
OFED isn't even at RC1 if it's not at feature freeze...
OFED has moved well beyond trying to provide an enterprise distribution to
simply providing an
When we started OFED we decided to enable new features that may be at a
lower stability level, as long as they do not harm the overall stability of
the OFED release.
I think XRC fulfills this criterion.
XRC changes the verbs interfaces and code. It increases the risk of
instability. Changes to IPoIB
I've pushed out release 1.0.5 of the librdmacm. It adds some additional
documentation to the man pages only.
Please update OFED 1.3 to use this version.
- Sean
but it's currently solidly part of both OFED 1.3 and 1.2.5. Should it
then be ?
IMO - any patch which has been rejected for upstream submission or is considered
experimental should be yanked from OFED.
Is there some other approach to the specific problem this patch was
attempting to fix ?
A
The patch has:
+ if (cm_id_priv->timeout_ms > cm_convert_to_ms(max_timeout)) {
+ printk(KERN_WARNING PFX "req timeout_ms %d %d, decreasing\n",
+ cm_id_priv->timeout_ms, cm_convert_to_ms(max_timeout));
+ cm_id_priv->timeout_ms =
11-12: SA cache session
12-1: IPoIB stateless offload issues
Sean, Roland, Dror - can you make it?
I should be able to make this, but as soon as you start pushing sessions
before noon, time should probably be made for lunch.
Sean - SA-caching - 45m
I think 30 minutes for this should be sufficient.
- Sean
1) the long-running, endless threads related to the SA caching thing
need to be there. Sean - I saw that you are preparing a session, correct? Will
you be presenting a few possible designs?
I was asked to prepare a session and will mention some of the general
scalability issues that we've seen with
7) the inform info code. Sean - you have implemented it and attempted to
push it through the SA caching push, but since the cache was rejected, so
was the inform info code. So the questions here: how do we make this
push happen? Are there any open issues, etc.?
There either needs to be an in
I've pushed out a release 1.0.4 of librdmacm that addresses some of the feedback
from Doug. Patches were posted previously to the list, with a small update
based on that feedback.
Please pull this release into OFED 1.3.
Changes from 1.0.3:
librdmacm/cma: provide wrapper functions to extract
Vladimir Sokolovsky wrote:
Hello,
Toward the OFED-1.3 beta release we want to prepare git trees with an
ofed_1_3 branch under git://git.openfabrics.org/ofed_1_3 for every
userspace package in OFED.
All maintainers of user space packages, please create an ofed_1_3 branch
in your git trees if this
Provide wrapper functions to retrieve the source and destination
addresses. This is based on feedback from Doug Ledford.
Signed-off-by: Sean Hefty [EMAIL PROTECTED]
---
If there are no objections, I would like to include this change in the next
release of librdmacm, and request that it go
During a failover test, we found that iSCSI over iSER reconnected to the
iSCSI target after 100 seconds due to the default max timeout (8 sec) and
retry number (15). The max timeout was adjustable with the module
parameter, max_timeout, of ib_cm.ko, but the retry number wasn't. Can we
add the retry
This is the correct tree. You want the latest release, 1.0.3.
Likewise for libibcm - release 1.0.1.
_
From: Tziporet Koren [mailto:[EMAIL PROTECTED]
Sent: Monday, October 08, 2007 1:13 PM
To: Hefty, Sean; Vladimir Sokolovsky
Cc: Woodruff, Robert J; ewg@lists.openfabrics.org
Thanks - applied.
I also created a release 1.0.1 of libibcm and added it to my public html
file and downloads directory.
- Sean
Thanks - I've added this to my patch list for the librdmacm. I plan on
releasing a new version of the librdmacm next week, pending the
acceptance of the kernel quality of service changes, which I'll ask
Roland to pull for 2.6.24 after he returns.
- Sean
If I clone from my local system over the net, I _get_ all the branches!
Anybody know why local clones on the ofa build server are not pulling
all the branches?
Maybe I'm abusing git?
It sounds like a difference between git versions. Older git versions
brought in remote branches such that
http://www.openfabrics.org/downloads/mpi/mvapich (pasha)
http://www.openfabrics.org/downloads/mpi/mvapich2 (rowland)
http://www.openfabrics.org/downloads/mpi/openmpi (jsquyres)
Are all of these MPI versions distributed by OFA? If they have other
official sites, should we instead direct users
I think everyone
would be better served by a process where individual maintainers were
responsible for releasing tarballs of their packages, with schedules
coordinated toward an overall openfabrics release
For what it's worth, I agree with this approach.
- Sean