Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2 You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
Yes, actual patches should be created under kernel_patches/fixes. Please update your git tree because the following patch fails: Can you explain how the patch fails? I don't see how putting the patch into a file helps. Why not apply the patches directly? To be consistent with 2.6.20 kernel.

Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey

2007-02-27 Thread Sean Hefty
Sorry for jumping into that thread, but although this patch will make things more spec compliant, it will break functionality we depend on. I suggest that we first find an alternate way to enable usage of partial partition membership before disabling that functionality at all. Can you

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3. Can't we just create a new branch (ofed_1_2_patched) with these

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
But you cannot keep a stack for more than one backport pushed, right? So you still need to be slapping the stacks of patches around for each backport. Why not have separate branches for each kernels too? ___ openib-general mailing list

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
I think it'll be much more work to maintain all these branches. And again, there will be conflicts, and it's too easy to get confused when resolving a conflict. Storing patches in a directory seems confusing to me. They must be applied in a specific order for everything to work, and that

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-26 Thread Sean Hefty
Vladimir Sokolovsky wrote: On Fri, 2007-02-23 at 12:15 -0800, Sean Hefty wrote: I would like these fixes in OFED 1.2 as well. What git tree / branch do I generate a patch against? - Sean git://git.openfabrics.org/~vlad/ofed_1_2/.git branch: ofed_1_2 Can you try pulling from

Re: [openib-general] [PATCH] IB/core: Set static rate in ib_init_ah_from_path()

2007-02-26 Thread Sean Hefty
int ib_init_ah_from_path(struct ib_device *device, u8 port_num, struct ib_sa_path_rec *rec, struct ib_ah_attr *ah_attr) { int ret; u16 gid_index; memset(ah_attr, 0, sizeof *ah_attr); ah_attr->dlid = be16_to_cpu(rec->dlid);

[openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey

2007-02-26 Thread Sean Hefty
communication between all members. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- drivers/infiniband/core/cache.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c index 558c9a0..6f366c3 100644 --- a/drivers
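
As a rough illustration of what this subject line describes, the sketch below contrasts a P_Key table search that masks the membership bit (bit 15) with one that compares all 16 bits. The flat table and the names `find_pkey_masked` and `find_pkey_exact` are hypothetical stand-ins chosen for this example; they are not the code in cache.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_BASE_MASK 0x7fffu /* low 15 bits; bit 15 is the membership bit */

/* Hypothetical helper: first index whose low 15 bits match, or -1.
 * The membership bit is ignored, so full and limited entries both match. */
static int find_pkey_masked(const uint16_t *table, size_t len, uint16_t pkey)
{
    assert(table != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if ((table[i] & PKEY_BASE_MASK) == (pkey & PKEY_BASE_MASK))
            return (int)i;
    return -1;
}

/* Hypothetical helper: first index whose full 16-bit value matches, or -1.
 * The membership bit takes part in the comparison. */
static int find_pkey_exact(const uint16_t *table, size_t len, uint16_t pkey)
{
    assert(table != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if (table[i] == pkey)
            return (int)i;
    return -1;
}
```

With a table that holds only the limited-membership entry 0x0001, a masked search for 0x8001 succeeds while an exact search does not, which is the behavioural difference the thread is weighing.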

[openib-general] [PATCH] for OFED 1.2

2007-02-23 Thread Sean Hefty
in ib_init_ah_from_wc correctly. The patches are in: git://git.openfabrics.org/~shefty/rdma-dev.git for-roland (sign-off line was added to the actual commit messages) Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- commit 28e218621d36cf9da42f07af08775769eb289fc0 Author: Sean Hefty [EMAIL PROTECTED

Re: [openib-general] [PATCH] librdmacm: fix bug causing failure to work with partial membership pkey

2007-02-22 Thread Sean Hefty
My understanding is that when an IPoIB broadcast domain contains both partial and full members (*) attempts to communicate between two partial members would silently fail. Is this silence something you think we should work to change? I'm looking at this from a different view than just ipoib

Re: [openib-general] [PATCH] librdmacm: fix bug causing failure to work with partial membership pkey

2007-02-22 Thread Sean Hefty
An IB multicast group _cannot_ have partial members so this never should get far enough to where two limited members would be unable to communicate. Can someone help my understanding here? Is ipoib joining a multicast group using the full membership PKey, even if the node that it joins from only

Re: [openib-general] [PATCH] librdmacm: fix bug causing failure to work with partial membership pkey

2007-02-22 Thread Sean Hefty
Can someone help my understanding here? Is ipoib joining a multicast group using the full membership PKey, even if the node that it joins from only has the limited membership PKey configured? And the code in ib_find_cached_pkey helps enable this? Yep. The ipoib create_child function Or-s

[openib-general] ipoib the partial pkey, was: librdmacm: fix bug causing failure to work with partial membership pkey

2007-02-22 Thread Sean Hefty
Doesn't this allow ipoib to join a multicast group for which it may not be able to communicate with all members? For the broadcast group, this seems like an error to me. Can ipoib work in such a configuration? If all nodes were assigned a partial membership PKey, none of them could communicate,
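
The concern raised here can be made concrete with a small sketch of the partition check: two ports can exchange packets only if their P_Keys share the same low 15 bits and at least one of them is a full member. The function name `pkeys_can_communicate` is hypothetical, chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u /* bit 15 set: full member; clear: limited member */
#define PKEY_BASE_MASK   0x7fffu /* low 15 bits identify the partition */

/* Hypothetical helper: true if two endpoints holding these P_Keys can talk.
 * Same partition is required, and two limited members cannot communicate. */
static bool pkeys_can_communicate(uint16_t a, uint16_t b)
{
    if ((a & PKEY_BASE_MASK) != (b & PKEY_BASE_MASK))
        return false;
    return ((a | b) & PKEY_FULL_MEMBER) != 0;
}
```

Under this rule a group whose members all hold the limited-membership key fails every pairwise check, which matches the "none of them could communicate" case in the message.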

[openib-general] [PATCH] 2.6.21-rc1: please pull rdma-dev.git for-roland

2007-02-22 Thread Sean Hefty
are in git.openfabrics.org/~shefty/rdma-dev.git, for-roland branch, which is based on 2.6.21-rc1. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- commit 28e218621d36cf9da42f07af08775769eb289fc0 Author: Sean Hefty [EMAIL PROTECTED] Date: Thu Feb 22 11:37:44 2007 -0800 rdma_cm: remove unused node_guid from

Re: [openib-general] [PATCH] 2.6.21-rc1: please pull rdma-dev.git for-roland

2007-02-22 Thread Sean Hefty
the patches are in git://git.openfabrics.org/~shefty/rdma-dev.git for-roland I will do that in the future. And yes, the sign off line was just a mistake. Thanks for fixing that. - Sean

Re: [openib-general] IB routing discussion summary

2007-02-21 Thread Sean Hefty
I sent a message on this topic to the IBTA several days ago, but I am still awaiting details (likely early next week). It should not be carried in the CM REQ. The SLID / DLID of the router ports should be derived through local subnet SA / SM query. When a CM REQ traverses one or more subnets

Re: [openib-general] [PATCH] librdmacm: fix bug causing failure to work with partial membership pkey

2007-02-21 Thread Sean Hefty
There is no problem. As i have explained over this thread the ipoib and the core abstract away from the user the actual value of the MSb of the pkey, that is whether it is a full or partial membership pkey. But *why* does the kernel code do this, and should it? - Sean

Re: [openib-general] [PATCH] librdmacm: fix bug causing failure to work with partial membership pkey

2007-02-21 Thread Sean Hefty
It does this since it makes life simple and robust. Is an SM prevented from loading two PKeys into an HCA's PKey table that differ by only the membership bit? I can't think of any reason to do such a thing, but depending on which index was selected could limit which nodes you could

Re: [openib-general] GetTable path record query not returning DGID=SGID paths

2007-02-21 Thread Sean Hefty
We haven't looked into this in more detail yet. This was our observation while testing on a larger (64 node) cluster this morning that we don't have access to at the moment. With the local SA cache running, we were surprised to see any retries, and when we looked into it more, retries were

Re: [openib-general] OFA 1.2 tarball creation

2007-02-19 Thread Sean Hefty
How exactly is various developers' source code pulled together to create the nightly OFA tarballs at www.openfabrics.org/builds (could this be put on the wiki somewhere?)? I went looking to see if some of Sean's work on RDMA CM had made it into these tarballs, and am not seeing code with the

Re: [openib-general] krping.c changes

2007-02-16 Thread Sean Hefty
Brett McMillian wrote: I wasn't sure who I should email about this, but I recently got krping to work between an Opteron and a PPC G5. However, in order for krping to work I had to make the following changes to krping.c to ensure the address, key, and length were being sent across the

Re: [openib-general] SA multicast patches

2007-02-16 Thread Sean Hefty
Well the consumer has to know what P_Key to use since it must match the QP that will be used to send/receive. So I would suggest not trying to guess in the low-level multicast.c code, and rely on the consumer to set it properly. I'm fine leaving it at 0. For now, I think the safest thing to do

Re: [openib-general] SA multicast patches

2007-02-16 Thread Sean Hefty
Roland Dreier wrote: OK, another question about the multicast.c code: +static struct mcast_group *mcast_find(struct mcast_port *port, +union ib_gid *mgid) +{ + struct rb_node *node = port->table.rb_node; + struct mcast_group *group; + int ret;
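
The quoted `mcast_find()` is cut off, but the general shape of a tree lookup keyed by a 16-byte MGID can be sketched in user space. This is an assumption-laden illustration: it uses a plain unbalanced binary search tree instead of the kernel rbtree, and `struct group_node`, `group_find`, and `group_insert` are hypothetical names, not the multicast.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct gid { unsigned char raw[16]; };

/* Hypothetical node type: one multicast group keyed by its MGID. */
struct group_node {
    struct gid mgid;
    struct group_node *left, *right;
};

/* Walk the tree, ordering nodes by memcmp() of the raw MGID bytes. */
static struct group_node *group_find(struct group_node *root,
                                     const struct gid *mgid)
{
    struct group_node *node = root;

    while (node) {
        int ret = memcmp(mgid->raw, node->mgid.raw, sizeof mgid->raw);
        if (!ret)
            return node;
        node = ret < 0 ? node->left : node->right;
    }
    return NULL;
}

/* Insert new_node; return the existing node if the MGID is already present,
 * or NULL if new_node was linked into the tree. */
static struct group_node *group_insert(struct group_node **root,
                                       struct group_node *new_node)
{
    struct group_node **link = root;

    assert(new_node != NULL);
    while (*link) {
        int ret = memcmp(new_node->mgid.raw, (*link)->mgid.raw,
                         sizeof new_node->mgid.raw);
        if (!ret)
            return *link;
        link = ret < 0 ? &(*link)->left : &(*link)->right;
    }
    new_node->left = new_node->right = NULL;
    *link = new_node;
    return NULL;
}
```

Ordering by raw bytes means an all-zero MGID sorts before every other key, so zero-MGID entries would need separate handling if more than one could exist at a time.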

Re: [openib-general] SA multicast patches

2007-02-16 Thread Sean Hefty
For now, I think the safest thing to do is just remove the entire 'else' portion from the function and return an error if the MGID is 0. Neither of the places that call into ib_sa_get_mcmember_rec() should pass in an MGID of 0. (I'm testing this now to verify.) I'm not sure if you'll need this,

Re: [openib-general] SA multicast patches

2007-02-16 Thread Sean Hefty
Or is it that you want to be able to iterate through the whole rbtree and get the MGID 0 groups too? This is it - see mcast_groups_lost(). That call transitions all multicast groups into an error state, and reports to the user that the group information may have been lost by the SA. (We

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-16 Thread Sean Hefty
Roland Dreier wrote: OK, I pulled this in to my for-2.6.21 branch and I will ask Linus to pull later today. Thanks for the review.

Re: [openib-general] [PATCH] IB/core: Set static rate in ib_init_ah_from_path()

2007-02-16 Thread Sean Hefty
Guys, any reason not to merge this? It's step one of the cleanups from Jason's patch to make IPoIB work with global routes... I would like to see this merged in. - Sean

Re: [openib-general] IB routing discussion summary

2007-02-15 Thread Sean Hefty
Ideas were presented around trying to construct an 'inter-subnet path record' that contained the following: - Side A GRH.SGID = active side's Port GID - Side A GRH.DGID = passive side's Port GID - Side A LRH.SLID = any active side's port LID - Side A LRH.DLID = A subnet router

Re: [openib-general] IB routing discussion summary

2007-02-15 Thread Sean Hefty
Is this first an IBTA problem to solve if you believe there is a problem? Based on my interpretation, I do not believe that there's an error in the architecture. It seems consistent. Additional clarification of what PathRecord fields mean when the GIDs are on different subnets may be

Re: [openib-general] Problem is routing CM REQ

2007-02-14 Thread Sean Hefty
Assume that the active and passive sides of a connection request are on different subnets and: Active side - LID 1 Active side router - LID 2 Passive side - LID 93 Passive side router - LID 94 What values are you suggesting are used for: Active side QP - DLID Passive side QP - DLID CM REQ

Re: [openib-general] IB routing discussion summary

2007-02-14 Thread Sean Hefty
Mike, are you expecting that routers will modify CM messages as they flow between subnets? - Sean

Re: [openib-general] Problem is routing CM REQ

2007-02-14 Thread Sean Hefty
I agree with what was in your response, however, this is how I interpret your answers: Active side QP - DLID 2 Passive side QP - DLID 94 CM REQ Primary Local Port LID no answer given - CM creates a REQ and populates the global information to identify the remote endnode. The LRH

Re: [openib-general] GetTable path record query not returning DGID=SGID paths

2007-02-14 Thread Sean Hefty
What is the value of NumbPath and how large a subnet is this ? I'm pretty sure this works; at least it did the last I checked. By default, NumbPath should be 127, but I would have expected a path record even with it set to 1. (I don't think we were using different PKeys or anything like that.)

Re: [openib-general] Problem is routing CM REQ

2007-02-13 Thread Sean Hefty
It does not need to comprehend the remote subnet(s) LID. That is the router protocol to determine. CM also must understand the GIDs involved which the router will process to figure out its LID mapping to the next hop. The CM REQ carries the remote router LID (primary local port lid -

[openib-general] IB routing discussion summary

2007-02-13 Thread Sean Hefty
Here's a first take at summarizing the IB routing discussion. The following spec references are noted: 9.6.1.5 C9-54. The SLID shall be validated (for connected QPs). 12.7.11. CM REQ Local Port LID - is LID of remote router. 13.5.4: Defines reversible paths. The main discussion point centered

Re: [openib-general] Problem is routing CM REQ

2007-02-13 Thread Sean Hefty
A LID is subnet local; on that we can all agree. The CM Req contains either the LID of a local subnet CA or the LID of a local router which will move the packet to the next hop to the destination. 12.7.11 is basically saying that the remote LID is the router's LID of the local subnet's router

Re: [openib-general] Problem is routing CM REQ

2007-02-12 Thread Sean Hefty
Ah, I think I missed the key step in your scheme.. You plan to query the local SM for SGID=remote DGID=local? (ie reversed from 'normal'. I was thinking only about the SGID=local DGID=remote query direction) I'm not sure that the query needs the GIDs reversed, as long as the path is

Re: [openib-general] Problem is routing CM REQ

2007-02-12 Thread Sean Hefty
1) What does the TClass and FlowLabel returned from SGID=local DGID=remote mean? Do you use it in the Node1 -> Node2 direction or the Node2 -> Node1 direction or both? Maybe it would help if we can agree on a set of expectations. These are what I am thinking: 1. An SA should be

Re: [openib-general] Problem is routing CM REQ

2007-02-12 Thread Sean Hefty
An endnode look up should be to find the address vector to the remote. A look up may return multiple vectors. The SLID would correspond to each local subnet router port that acts as a first-hop destination to the remote subnet. I don't see why the router protocol would not simply

Re: [openib-general] Problem is routing CM REQ

2007-02-12 Thread Sean Hefty
4. A PR from the local SA with reversible=1 indicates that data sent from the remote GID to the local GID using the PR TC and FL will route locally using the specified LID pair. This holds whether the PR SGID is local or remote. 5. A PR from a remote SA with reversible=1 indicates that data

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
I have a follow up question to this.. With CM how is the SL for each side determined? I'm looking through the code here and it looks like the SL of the active side is passed in the REQ to the passive side (ie both sides are the same) But cma_query_ib_route does not set the reversible bit when

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
SLID corresponding to SGID and a DLID for some IB router on the subnet which can route to the remote DGID. This was my assumption as well. An SM is free to choose SLID and DLID to supply to if there are multiple LIDs for the ports in question it can choose alternates. The key here is

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-09 Thread Sean Hefty
+ member = kzalloc(sizeof *member, gfp_mask); + if (!member) + return ERR_PTR(-ENOMEM); This appears okay to replace with kmalloc. + group = kzalloc(sizeof *group, gfp_mask); + if (!group) + return NULL; + We would need additional

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
the missing part (right now) is locating the SA on that remote subnet if this is a needed function. Maybe we can expose this to SA clients through a ServiceRecord? This doesn't solve how the two SAs find each other (or any of the other difficult stuff), but with this and the path record

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
Sean: Even if you can query both SA's there isn't enough information to force things to use the same router path in each direction. My assumption is that the remote SA contains the necessary information about how a packet coming from the local SGID to the remote DGID would be routed on the

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
The hard part is the global distribution of this information. The best idea I can come up with for locating remote SAs is to have the SAs assign themselves a specific Unicast Global GID Assigned Value. So, each SA gives themselves a GID similar to: 64-bit subnet prefix :: 1. Hosts on remote
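
The "64-bit subnet prefix :: 1" idea can be sketched as a 16-byte GID whose first eight bytes carry the subnet prefix and whose last eight bytes carry the value 1, both in network byte order. This only illustrates the proposal as worded in the message; the function name `make_sa_gid` is hypothetical and nothing here reflects an assigned or standardized address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill gid[0..15] with <subnet_prefix>::1.
 * Bytes 0-7 hold the prefix, most significant byte first;
 * bytes 8-15 hold the 64-bit value 1, most significant byte first. */
static void make_sa_gid(uint64_t subnet_prefix, uint8_t gid[16])
{
    assert(gid != NULL);
    for (size_t i = 0; i < 8; i++) {
        gid[i] = (uint8_t)(subnet_prefix >> (56 - 8 * i));
        gid[8 + i] = 0;
    }
    gid[15] = 1;
}
```

A host that knows the remote subnet prefix could then form the SA's GID locally, without a directory lookup, which appears to be the point of the suggestion.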

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
So basically what you are saying is that the TClass and FlowLabel act as some kind of global dis-ambiguation that lets all SAs know that the tuple SGID,DGID,TClass,FlowLabel MUST be matched with LRH_A,LRH_B on each side. Sort of... My reasoning is that if you look at a packet traveling from the

Re: [openib-general] Problem is routing CM REQ was: Use a GRH when appropriate for unicast packets

2007-02-08 Thread Sean Hefty
Hum, you mean to meet the LID validation rules of 9.6.1.5? That is a huge PITA.. [IMHO, 9.6.1.5 C9-54 is a mistake, if there is a GRH then the LRH.SLID should not be validated against the QP context since it makes it extra hard for multipath routing and QoS to work...] Yes - this gets messy.

Re: [openib-general] Problem is routing CM REQ was: Use a GRH when appropriate for unicast packets

2007-02-08 Thread Sean Hefty
This requires that the passive side be able to issue path record queries, but I think that it could work for static routes. A point was made to me that the remote side could be a TCA without query capabilities. Are you referring to SA query capabilities ? Would such a device just be expected

Re: [openib-general] Problem is routing CM REQ

2007-02-08 Thread Sean Hefty
Looking at the problem more, I think that the issue extends to the remote port LID as well. My expectation with a local path record query is that the SLID is the local port, and the DLID is the local router. This should be sufficient for one-way UD traffic, but for connected traffic

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty
Oops, I'll fix these style things and send a new patch. Jason, what's the status of this patch? (I ask because I'm starting to look at router support in the stack.) - Sean

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty
I didn't get too far on getting CMA to work. Beyond the bad HopLimit field I was seeing, Hal pointed out a number of problems in IBA that would prevent it from working as is: I've started thinking about what it would take to get the rdma cm to work across a router. I think the rdma cm may

Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Sean Hefty
Michael S. Tsirkin wrote: Repost. Could everyone please look at git://git.openfabrics.org/~mst/newofed.git and tell me whether this looks acceptable? I don't see anything listed for this off of the web site, and cloning it produces an empty tree. - Sean

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty
Basically, if IB routers are used, and the IPoIB feature of *not* spanning a subnet is used (for scalability?) then you need an alternate way to specify addresses to rdma cm. This was the case I was thinking of. Without global IB name service resolution, how do you get the GID of the

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty
I don't think that is the main problem - though clearly the way things are now (for better or worse) rdma cm requires the IPoIB subnet to span all of the IB subnets.. The main problem with the protocol is in the LID selection for routed paths on the passive side. It can't rely on the active

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty
We don't want to have the routers snoop and alter CM GMPs. agreed The passive side cannot use information from the LRH to get the router LID since the LRH may not be reversible. argh... I was interpreting symmetric paths at the network layer (SGID to DGID) and applying it at the link layer

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty
If name service resolution gives me an IPv6 address that's off of the local subnet, but the ARP response gives me an address that's on the local subnet, then I think we can assume that ARP was unsuccessful in resolving the address to the remote GID. (I.e. the GID should be for a router.) If

Re: [openib-general] [RFC] [PATCH] ib_usa: export multicast and informinfo registration to userspace

2007-02-06 Thread Sean Hefty
+static int process_mcast(struct usa_file *file, struct ib_usa_request *req, + int out_len) +{ + /* Only indirect requests are currently supported. */ + if (!req->local) + return -ENOSYS; + + switch (req->method) { + case IB_MGMT_METHOD_GET: +

Re: [openib-general] [PATCH] [RFC] ofed_1_2 - SLES9SP3 Backport - IWCM workaround for ip_dev_find() bug.

2007-02-06 Thread Sean Hefty
Steve Wise wrote: I propose the following fix for supporting iWARP on SLES9SP3. This fixes bug 325. Sean, can you please review this? The changes seem fine with me. Does this bug affect the ib_addr module as well? (addr_resolve_local and rdma_translate_ip) - Sean

Re: [openib-general] [PATCH] [RFC] ofed_1_2 - SLES9SP3 Backport - IWCM workaround for ip_dev_find() bug.

2007-02-06 Thread Sean Hefty
Actually, yes it does. Here's one case (that I just tested :): If you rdma_bind() to an explicit local address, it will fail. Foo! I guess I'll need to address the uses of ip_dev_find() in addr.c as well before we commit this. Can we just backport our own version of

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-06 Thread Sean Hefty
Can you comment on the multicast changes merge for 2.6.21 status? Where are the final patches that you want to merge? Try the for-roland branch at git.openfabrics.org/~shefty/scm/rdma-dev.git. If this doesn't work, or you hit any snags, let me know, and I'll try to correct any issues so that

Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-05 Thread Sean Hefty
The name is ib_mcast_wq which is too long for older kernels. Did we lose a backport patch? Not sure what happened here. Sean, could you rename ib_mcast_wq to ib_mcast please? I renamed the workqueue for what I requested to pull upstream, and I added a patch to my pull request to rename a

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-02 Thread Sean Hefty
Sean Hefty (3): rdma_cm: Increment port number after close to avoid re-use. ib_sa: track multicast join/leave requests rdma_cm: add multicast communication support Assuming that you haven't looked at this yet, I updated the ib_sa patch above to shorten the workqueue name

[openib-general] [RFC] [PATCH] ib_usa: export multicast and informinfo registration to userspace

2007-02-02 Thread Sean Hefty
the usermad interface. The user to kernel interface is minimal, but was designed to be flexible enough to add additional SA client support if needed. (E.g. local SA cache lookup, SA queries, service registration, etc.) Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- The following patch is also

Re: [openib-general] new IB CM reject reason

2007-02-01 Thread Sean Hefty
No, I don't think application crashed makes sense as an element of wire protocol. I think an optional logging of errors in kernel CM would be a much better solution. I know I had to add some printks each time I was debugging SDP. The application crashed scenario is what highlighted the

Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Sean Hefty
- Sean, please base your branches on specific -rc from linus (OFED 1.2 is now -rc7). My branches should be in sync with rc6. The original branches were built from an earlier rc version, and updated by pulling in the latest rc from Linus through my master branch. Are you wanting the

Re: [openib-general] new IB CM reject reason

2007-02-01 Thread Sean Hefty
Would you be interested in a patch making it possible to enable logging CM errors and/or all CM events? A patch for this would be fine with me. Are we talking about code 28? My spec lists it as consumer reject. The meaning of *private data* is consumer defined. The consumer

Re: [openib-general] new IB CM reject reason

2007-02-01 Thread Sean Hefty
And my claim is that you should define private data format to go with this other reason otherwise you are not really solving the problem. This is not a consumer issued reject. It is a CM issued reject, so the private data is ignored. This is no different than several other reject reasons

Re: [openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-31 Thread Sean Hefty
I understand that your approach relies on the uniqueness of the MGID being generated. This means that to have different MPI jobs use different MGIDs , the MGIDs must be generated --always-- on the same NODE and be propagated to other nodes/ranks participating in that MPI job - correct? MGID

[openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
need to reload that on my system), but did test this fix by forcing the abi to version 3 with a newer kernel loaded. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/src/cma.c b/src/cma.c index 2d2a587..c5f8cd9 100644 --- a/src/cma.c +++ b/src/cma.c @@ -653,11 +653,49 @@ static int

Re: [openib-general] ip_ib_mc_map?

2007-01-31 Thread Sean Hefty
where can I find this symbol? I can't load rdma_cm on rhel4u4... rdma_cm: Unknown symbol ip_ib_mc_map This is in include/net/ip.h for current systems. It is part of ipoib support. - Sean

Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
Steve Wise wrote: Should this be a problem for OFED 1.2? I would think the ABI for all backports should be the same, so it wouldn't be a problem. Is this true? I'm assuming all backported UCMA modules would have the same ABI. This is a problem for anyone that tries to use a newer version

Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
Fixed it for IB maybe, but not for iWarp, right? It should be fixed for both. So OFED 1.2 will be ABI 3, right? OFED will be ABI 4, since it will include multicast support (which is what causes the ABI to bump from 3 to 4). - Sean

[openib-general] new IB CM reject reason

2007-01-31 Thread Sean Hefty
We've run into an issue with the IB CM reject reason codes. When a remote application crashes during connection establishment, the connection will be rejected by the kernel CM. Unfortunately, there's not a decent reject reason that maps to this event. Currently, the ib_cm issues the reject as

Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
But there still exists an iwarp issue that I need to fix because librdmacm (the one shipped in OFED) now calls the kernel rdma_init_qp_attr() function via ucma before the library calls kernel rdma_connect() via ucma... Can you clarify which versions of the librdmacm and kernel you are using? The

Re: [openib-general] new IB CM reject reason

2007-01-31 Thread Sean Hefty
Is there a reason to distinguish between a connection that is being rejected because the listener crashed and a connection that is being rejected because the listener does not exist? This only covers the case for the REQ received state, and could work for that state. But the problem can also

Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
OFED will be ABI 4, since it will include multicast support (which is what causes the ABI to bump from 3 to 4). Has the ofed tree been updated to ABI 4 yet? I just looked in vlad's git tree a while ago, and his ofed_1_2 branch had ABI 3. His ofed_1_2_multicast didn't have an rdma_user_cm.h

Re: [openib-general] new IB CM reject reason

2007-01-31 Thread Sean Hefty
So would that mean that only an InfiniBand specific wire-protocol code was needed, and that no API enhancement was required? Yes - I'm talking about the IB CM wire-protocol specifically. Actual implementation changes would likewise be limited to the ib_cm. Trying to describe failures in a

Re: [openib-general] ip_ib_mc_map?

2007-01-31 Thread Sean Hefty
Steve Wise wrote: Perhaps there's no backport for this to rhel4u4? I would have thought so, but I really don't know. The function is called from net/ipv4/arp.c, and not directly by ipoib. So, I don't know how the backport patches typically handle this. - Sean

[openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-01-30 Thread Sean Hefty
Roland, I've created a 'for-roland' branch off of my git tree: git://git.openfabrics.org/~shefty/rdma-dev.git with the following changes: Sean Hefty (3): rdma_cm: Increment port number after close to avoid re-use. ib_sa: track multicast join/leave requests rdma_cm: add

Re: [openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-30 Thread Sean Hefty
Excellent -- is this in a git tree somewhere that I can grab (I'm new to git)? Or, what would be an appropriate tree to apply this to? This is now available from my rdma-dev.git tree on openfabrics. The patch is included in the multicast and ofed_1_2 branches. - Sean

Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-30 Thread Sean Hefty
*Sources developed in OFA:* 1. Each git owner will open a branch with the name ofed_1_2. This branch should be opened on 31-Jan (based on code readiness we will review today). I've added ofed_1_2 branches to my libibcm.git, librdmacm.git, and rdma-dev.git trees. - Sean

Re: [openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-30 Thread Sean Hefty
Excellent -- is this in a git tree somewhere that I can grab (I'm new to git)? Or, what would be an appropriate tree to apply this to? I've committed changes to the librdmacm multicast test program (mckey) that provides an example of using this functionality. The changes are in the

Re: [openib-general] [RFC] Performance Manager

2007-01-29 Thread Sean Hefty
Initially ? It is also an implementation phasing issue as stated. The core support is needed in both so there is very little unneeded work to get to the first phase in terms of a distributed approach. We would certainly grow/evolve towards this after that initial implementation. Based on what

[openib-general] [PATCH] ib_sa/multicast: Fix crash when multiple HCAs are present

2007-01-29 Thread Sean Hefty
We need to use a per device event handler, rather than a single, global handler that gets reinitialized when a new device is added to the system. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c index fde977e

Re: [openib-general] oops at device removal

2007-01-29 Thread Sean Hefty
@@ -71,6 +70,7 @@ struct mcast_device { int start_port; int end_port; struct mcast_port port[0]; + struct ib_event_handler event_handler; }; The mcast_port data is allocated at the end of the structure. event_handler
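The layout concern raised here (per-port data is allocated past the end of the structure, so anything declared after the trailing array would overlap it) can be illustrated with a small standalone sketch. The types below are hypothetical stand-ins rather than the kernel's mcast_device, and the sketch uses a C99 flexible array member instead of the port[0] form in the quoted diff. It assumes the remedy is simply to declare fixed-size members before the trailing array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types in the quoted diff. */
struct demo_handler {
    void (*fn)(int event);
};

struct demo_port {
    int port_num;
    int state;
};

struct demo_device {
    int start_port;
    int end_port;
    struct demo_handler event_handler; /* fixed-size members come first */
    struct demo_port port[];           /* trailing array must be last */
};

/* Allocate the device and its per-port entries in one block. */
static struct demo_device *demo_device_alloc(int start_port, int end_port)
{
    size_t nports;
    struct demo_device *dev;

    assert(end_port >= start_port);
    nports = (size_t)(end_port - start_port + 1);

    dev = calloc(1, sizeof(*dev) + nports * sizeof(dev->port[0]));
    if (!dev)
        return NULL;

    dev->start_port = start_port;
    dev->end_port = end_port;
    for (size_t i = 0; i < nports; i++)
        dev->port[i].port_num = start_port + (int)i;
    return dev;
}
```

Because the port entries start at the end of the fixed part of the structure, writes to port[i] can never reach event_handler in this layout.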

Re: [openib-general] CM callbacks

2007-01-29 Thread Sean Hefty
Eric Barton wrote: Is the following possible? 1. I listen for connection requests. 2. RDMA_CM_EVENT_CONNECT_REQUEST is delivered, I rdma_accept() successfully and return from the callback. 3. RDMA_CM_EVENT_DISCONNECTED is delivered. Am I wrong to assume I can only get

[openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-29 Thread Sean Hefty
if additional join requests are for a specific MGID, or require IP to MGID mapping. This is done by comparing the requested join address against SA assigned MGIDs. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 827df2a

Re: [openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-29 Thread Sean Hefty
To allow others to join this group, we need a way to determine if additional join requests are for a specific MGID, or require IP to MGID mapping. This is done by comparing the requested join address against SA assigned MGIDs. Still not understanding this part -- this means that I'm not able

Re: [openib-general] [RFC] [PATCH 2/2] for 2.6.21/OFED1.2 rdma_cm: add multicast support

2007-01-27 Thread Sean Hefty
Sean, were you able to try this with an iWARP device to check for regressions? No - I don't have any iWarp devices available to me. I thought about possible regressions too, since this changes QP initialization, and is why I listed this patch in the series as an RFC. Hopefully by getting this

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-01-26 Thread Sean Hefty
Maybe the best fix is to have ib_init_ah_from_path() itself print a warning if the GID index can't be found, just set the gid_index to 0 in that case, and change ib_init_ah_from_path() to return void? What do you think of doing something like this for 2.6.21: Changing this to a void seems fine
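The behavior proposed in the quoted text (warn when the GID index cannot be found, fall back to index 0, and return void) might look roughly like the following. This is a self-contained sketch with hypothetical names (demo_gid, demo_ah_attr, demo_find_gid, demo_init_ah); it does not reproduce the signature or internals of ib_init_ah_from_path(), and the table lookup is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_GID_LEN 16

/* Hypothetical types; not the kernel's union ib_gid or struct ib_ah_attr. */
struct demo_gid {
    unsigned char raw[DEMO_GID_LEN];
};

struct demo_ah_attr {
    int use_grh;
    int sgid_index;
};

/* Return the index of gid in the table, or -1 if it is not present. */
static int demo_find_gid(const struct demo_gid *table, int n,
                         const struct demo_gid *gid)
{
    for (int i = 0; i < n; i++)
        if (memcmp(table[i].raw, gid->raw, DEMO_GID_LEN) == 0)
            return i;
    return -1;
}

/* Never fails: a missing GID produces a warning and falls back to index 0. */
static void demo_init_ah(const struct demo_gid *table, int n,
                         const struct demo_gid *sgid,
                         struct demo_ah_attr *attr)
{
    int idx;

    assert(n > 0); /* index 0 must exist for the fallback to make sense */
    idx = demo_find_gid(table, n, sgid);
    if (idx < 0) {
        fprintf(stderr, "warning: source GID not found, using index 0\n");
        idx = 0;
    }
    attr->use_grh = 1;
    attr->sgid_index = idx;
}
```

Since the initializer can no longer fail, the only error a caller has to handle is the one from the later address-handle creation step.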

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-01-26 Thread Sean Hefty
And even if that can occur, can the problem be pushed off until the user calls ib_create_ah? Not sure what you mean by this. ib_init_ah_from_path() is only used to initialize the ah_attr before calling ib_create_ah(). We have to trap for failure from ib_create_ah(), so if

Re: [openib-general] [RFC] Performance Manager

2007-01-26 Thread Sean Hefty
There are numerous PerfManager models which can be supported: 1. Integrated as thread(s) with OpenSM (run only when SM is master) 2. Standby SM 3. Standalone PerfManager (not running with master or standby SM) 4. Distributed PerfManager (most scalable approach) IMO, we will eventually need

[openib-general] [RFC] [PATCH 0/2] for 2.6.21/OFED1.2: add IB multicast support

2007-01-26 Thread Sean Hefty
for a release of the librdmacm. (There are 1.0 and 1.1 branches there already, which are left from the initial svn conversion to git.) Signed-off-by: Sean Hefty [EMAIL PROTECTED]

[openib-general] [PATCH 1/2] for 2.6.21: ib_sa/ib_ipoib: add IB multicast support

2007-01-26 Thread Sean Hefty
requests, and modify ib_ipoib to use the new module. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Except for the previous bug fix to prevent kernel crashes, I don't believe that this patch has changed. And depending on what happens with ib_init_ah_from_path, we will likely want to have

[openib-general] [RFC] [PATCH 2/2] for 2.6.21/OFED1.2 rdma_cm: add multicast support

2007-01-26 Thread Sean Hefty
. This requires saving the qkey to use when attaching to a device, so that it is available when creating the QP. The qkey information is exported to the user through the existing rdma_init_qp_attr() routine. Multicast support is exported to userspace through the rdma_ucm. Signed-off-by: Sean Hefty [EMAIL

Re: [openib-general] RDMA CM multicast

2007-01-26 Thread Sean Hefty
I don't think this is as easy as you've made it sound. I see two approaches to preventing address collision -- both require voluntary participation. First is a centralized authority approach (this has been used for IP multicast-based protocols). This means running some sort of daemon in a location

Re: [openib-general] [PATCH 1/2] rdma_cm: add support to join IPOIB multicast groups

2007-01-25 Thread Sean Hefty
this means that basically (*) you have my OK for pushing the multicast support to OFED 1.2 (again my thinking is that this is fine for upstream as well). I've pushed these changes out to my rdma-dev.git tree. The only missing piece here, as we agreed yesterday is to allow using PS_IPOIB IDs for

[openib-general] librdmacm code confusion wrt iWarp

2007-01-25 Thread Sean Hefty
Steve, I'm looking at rdma_create_qp() in librdmacm. There's a section of code in there: if (id->ps == RDMA_PS_UDP) ret = ucma_init_ud_qp(id_priv, qp); else ret = ucma_init_ib_qp(id_priv, qp); Both of these calls transition the QP to INIT, so that the user can post receives
