Re: How to handle illegal multicast addresses in IPoIB?

2009-09-21 Thread Jason Gunthorpe
On Mon, Sep 21, 2009 at 05:39:11PM +0300, Moni Shoua wrote: So there is only one question left. Do we also need the other patch? Yes, we need something like the other patch to handle the case where the SM is unwilling to create a group for some reason. Right now that is exactly the same failure

Re: [ofa-general] Re: [GIT PULL] please pull ummunotify

2009-09-28 Thread Jason Gunthorpe
On Mon, Sep 28, 2009 at 10:49:23PM +0200, Pavel Machek wrote: I don't remember seeing discussion of this on lkml. Yes it is in -next... eg http://lkml.org/lkml/2009/7/31/197 and followups, or search for v2 and earlier patches. Well... it seems little overspecialized. Just

Re: ib_types.h moving [was: Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm]

2009-10-01 Thread Jason Gunthorpe
On Thu, Oct 01, 2009 at 03:50:07PM -0700, Ira Weiny wrote: On Wed, 30 Sep 2009 20:57:52 -0600 Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Wed, Sep 30, 2009 at 06:31:26PM -0700, Ira Weiny wrote: Now I likely would agree with Ira that moving ib_types.h to libibumad

Re: [PATCH 1/2] rdma/cm: support option to allow manually setting IB path

2009-10-05 Thread Jason Gunthorpe
On Mon, Oct 05, 2009 at 10:43:44AM -0700, Sean Hefty wrote: Export rdma_set_ib_paths to user space to allow applications to manually set the IB path used for connections. This allows alternative ways for a user space application or library to obtain path record information, including

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-06 Thread Jason Gunthorpe
On Tue, Oct 06, 2009 at 12:05:02PM -0700, Sean Hefty wrote: From user space, the call sequence does not change. The user calls rdma_resolve_addr, rdma_resolve_route, rdma_connect, etc. It is up to the librdmacm to perform the resolution. Today, the resolution request is simply passed down

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-06 Thread Jason Gunthorpe
On Tue, Oct 06, 2009 at 03:53:21PM -0700, Sean Hefty wrote: Actually, thinking about it some more, that would be very helpful. As I said before, I have worked on apps using IB CM. The only reason is to have complete control over the addressing. If I could use RDMA CM API in some kind of AF_GID

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-06 Thread Jason Gunthorpe
On Tue, Oct 06, 2009 at 06:20:05PM -0700, Sean Hefty wrote: There are 3 interfaces of interest here. The librdmacm API, the rdma_ucm user to kernel interface, and the rdma_cm interface. These patches are looking to change the rdma_ucm interface. I want to avoid changing the API or behavior

Re: [PATCH 1/2] rdma/cm: support option to allow manually setting IB path

2009-10-06 Thread Jason Gunthorpe
On Tue, Oct 06, 2009 at 12:05:02PM -0700, Sean Hefty wrote: Ideally the best approach would be to have a mux at the ib_mad level. We could allow a user space application to intercept all outbound MADs for a given class and/or attribute. Unlike the present snooping of mads, this would

Re: [ofa-general] Re: [GIT PULL] please pull ummunotify

2009-10-12 Thread Jason Gunthorpe
On Mon, Oct 12, 2009 at 08:19:44PM +0200, Ingo Molnar wrote: After that point the scheme is perfectly lossless. Well if it can OOM it's not lossless, obviously. You just define event loss to be equivalent to Destruction of the universe. ;-) It can't OOM once the ummunotify registration is

Re: [ewg] rping is not resolving ipv6 addresses

2009-10-12 Thread Jason Gunthorpe
On Mon, Oct 12, 2009 at 10:52:59AM -0700, David J. Wilder wrote: It is not, IPv6 link local addresses must be scoped. rping is parsing the address with getaddrinfo, that does correctly set the sin6_scope_id value in the sockaddr. ping6 is scoping the address (setting sin6_scope_id) by
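
As a rough illustration of the scoping point in the snippet above, the sketch below parses a numeric IPv6 literal with getaddrinfo() and returns the sin6_scope_id it produced. The helper name scope_id_of() is made up for this example and is not taken from rping. It assumes a libc whose getaddrinfo() accepts a "%scope" suffix on link-local literals.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a numeric IPv6 literal and return the
 * sin6_scope_id that getaddrinfo() filled in.
 * Returns 0 when no scope was given and -1 when parsing fails. */
long scope_id_of(const char *text)
{
    struct addrinfo hints, *res = NULL;
    long scope;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET6;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;    /* literal only, no DNS lookup */

    if (getaddrinfo(text, NULL, &hints, &res) != 0 || res == NULL)
        return -1;

    scope = (long)((struct sockaddr_in6 *)res->ai_addr)->sin6_scope_id;
    freeaddrinfo(res);
    return scope;
}
```

An unscoped link-local literal such as "fe80::1" comes back with scope 0, which is the case the thread says must be rejected or resolved before the address can be used.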

Re: [ofa-general] Re: [GIT PULL] please pull ummunotify

2009-10-12 Thread Jason Gunthorpe
On Mon, Oct 12, 2009 at 10:20:46PM +0200, Ingo Molnar wrote: It might be more acceptable because the flag-hint mechanism can at most cause over-flushing - while with perf events we might miss to invalidate a range altogether. Right. Overflushing is not important, but missing an event

Re: [ofw] [PATCH] opensm - standardize on a single Windows #define

2009-10-12 Thread Jason Gunthorpe
On Mon, Oct 12, 2009 at 03:29:38PM -0700, Smith, Stan wrote: If __linux__ doesn't work for you, then please create a Linux Platform define I can use. Pretty much all the patches I've seen you make should be guarded by __WIN__, you shouldn't be using __linux__. opensm and the other OFA stuff

Re: [ofa-general] Re: [GIT PULL] please pull ummunotify

2009-10-13 Thread Jason Gunthorpe
On Tue, Oct 13, 2009 at 08:40:06AM +0200, Ingo Molnar wrote: * Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Mon, Oct 12, 2009 at 10:20:46PM +0200, Ingo Molnar wrote: It might be more acceptable because the flag-hint mechanism can at most cause over-flushing - while

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-13 Thread Jason Gunthorpe
On Tue, Oct 13, 2009 at 03:09:40PM -0700, David J. Wilder wrote: Here is a patch to addr6_resolve_remote() to correctly handle link-local address. It should cover all the conditions Jason described. Looks pretty good to me, definitely on the right track. Hmm.. Actually, upon comparing to

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread Jason Gunthorpe
On Wed, Oct 14, 2009 at 10:30:05AM -0700, David J. Wilder wrote: This looks good. One concern, it may be obtuse, but if both the src and dst are link-local addresses should only one need to be scoped? This patch will require the src to always be scoped when using link local. The TCPv6

[PATCH libibcm] Return errors from the kernel consistently

2009-10-19 Thread Jason Gunthorpe
with that. Codes should have been positive for alignment with POSIX, but it is much too late for that.. Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com --- src/cm.c | 30 +++--- 1 files changed, 15 insertions(+), 15 deletions(-) librdmacm has the same

Re: [PATCH libibcm] Return errors from the kernel consistently

2009-10-20 Thread Jason Gunthorpe
to return errno than to fixup the cases that don't, so let's stick with that. Codes should have been positive for alignment with POSIX, but it is much too late for that.. Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com src/cm.c | 30 +++--- 1 files

Re: [PATCH 1/2] rdma/cm: support option to allow manually setting IB path

2009-10-20 Thread Jason Gunthorpe
On Fri, Oct 09, 2009 at 02:48:03PM -0700, Sean Hefty wrote: Before spending any more time on this patch series, is there any disagreement to accepting this patch (as is or slightly modified) upstream? Can you please have some way for this to pass APM data and the reversible GMP path as well?

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-20 Thread Jason Gunthorpe
On Tue, Oct 20, 2009 at 11:08:58AM -0700, Sean Hefty wrote: Looking on kernel cma.c format_hdr code it first branches on the address family and next of the port space. Going with your proposed flow, I understand that an app call to rdma_resolve_addr will be broken down to rdma_bind_addr, ACM

[PATCH libibrdmacm] Return errors from the library consistently

2009-10-20 Thread Jason Gunthorpe
return 0. Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com --- examples/rping.c | 22 ++--- man/rdma_cm.7| 16 src/cma.c| 94 - 3 files changed, 68 insertions(+), 64 deletions(-) I think

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-20 Thread Jason Gunthorpe
On Tue, Oct 20, 2009 at 01:05:15PM -0700, Sean Hefty wrote: I agree. But we can still support AF_IB/PS_TCP by simply assigning the service ID correctly. rdma_bind_addr only needs to fill in a service id if one is not given. This should enable 'one is not given' == 0 service ID? yes

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-20 Thread Jason Gunthorpe
On Tue, Oct 20, 2009 at 01:48:34PM -0700, Sean Hefty wrote: Private data: - AF_IB/PS_TCP - the kernel munges the private data to be compatible with AF_INET/PS_TCP, but otherwise is the same. - AF_IB/PS_IB - the kernel doesn't touch the private data. I was thinking AF_IB/* - kernel doesn't

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-21 Thread Jason Gunthorpe
On Wed, Oct 21, 2009 at 05:40:30PM -0700, Sean Hefty wrote: Even so, it still seems OK to me: Path: addr4_resolve_remote $ ip route get 10.0.0.11 from 192.168.122.1 local 10.0.0.11 from 192.168.122.1 dev lo srcIP = 192.168.122.1 rdma_translate_ip(dst_ip = 10.0.0.11)

Re: [PATCH 1/2] rdma/cm: support option to allow manually setting IB path

2009-10-21 Thread Jason Gunthorpe
On Wed, Oct 21, 2009 at 06:07:54PM -0700, Sean Hefty wrote: I'm reluctant to override fields like this to save 4 bytes. The clarity and extensibility of using an additional flags field seems worth it to me, and the processing code is not complex. I cannot think of a motivation to save the 4

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-22 Thread Jason Gunthorpe
On Wed, Oct 21, 2009 at 11:49:52PM -0700, Sean Hefty wrote: This is actually something of a mandatory notion to implement the full generality of the IB CM protocol which allows the CM REP to contain a port GUID of another port on the same node (multi-port APM is an IB feature). So

Re: [PATCH v2] [RFC] rdma/cm: support option to allow manually setting IB path

2009-10-22 Thread Jason Gunthorpe
On Thu, Oct 22, 2009 at 01:10:09AM -0700, Sean Hefty wrote: +static int ucma_set_ib_path(struct ucma_context *ctx, + struct ib_path_rec_data *path_data, size_t optlen) +{ + struct ib_sa_path_rec sa_path; + struct rdma_cm_event event; + int ret; + +

Re: [PATCH v2] [RFC] rdma/cm: support option to allow manually setting IB path

2009-10-22 Thread Jason Gunthorpe
On Thu, Oct 22, 2009 at 10:52:13AM -0700, Sean Hefty wrote: +static int ucma_set_ib_path(struct ucma_context *ctx, + struct ib_path_rec_data *path_data, size_t optlen) +{ + struct ib_sa_path_rec sa_path; + struct rdma_cm_event event; + int ret; + + if

Re: [PATCH v2] [RFC] rdma/cm: support option to allow manually setting IB path

2009-10-22 Thread Jason Gunthorpe
On Thu, Oct 22, 2009 at 11:28:11AM -0700, Sean Hefty wrote: I don't like the idea of the kernel silently ignoring the alternate path. Returning an error seems like a better idea. Then provide a way for userspace to know WTF to do. Without a negotiation process this is now an 'impossible to use

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-25 Thread Jason Gunthorpe
On Sun, Oct 25, 2009 at 01:25:21PM +0200, Or Gerlitz wrote: well, you didn't address some of my comments (not the ice-cream ones...), which come to say that this wouldn't be inter-operable if for one side you convert INET/TCP to IB/IB and for the other side you don't (e.g userA/userB

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-25 Thread Jason Gunthorpe
On Sun, Oct 25, 2009 at 01:52:13PM +0200, Or Gerlitz wrote: Jason, Have you even looked into or tested any of the bonding load-balancing modes with ipoib? some/most of them are not applicable to IPoIB and I don't think that the ones which may be such were ever tested. I was saying that

[PATCH RDMA] Fixup IPv6 support and IPv4 routing corner cases for RDMA CM

2009-10-27 Thread Jason Gunthorpe
- Fold addr_send_arp into addr_resolve so that it uses the correct dst structure. Based on work from David J. Wilder dwil...@us.ibm.com Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com Reported-by: David J. Wilder dwil...@us.ibm.com Reported-by: leo.tomi...@oracle.com --- drivers

Re: [PATCH RDMA] Fixup IPv6 support and IPv4 routing corner cases for RDMA CM

2009-10-28 Thread Jason Gunthorpe
On Wed, Oct 28, 2009 at 10:05:19AM -0700, Sean Hefty wrote: Can you explain how rdma_resolve_addr is used in conjunction with multicast? I do not understand what the dest would be. Is it just a man page typo? A UD endpoint can communicate using multicast and to other UD endpoints. A user

Re: [PATCH v3] [RFC] rdma/cm: support option to allow manually setting IB path

2009-10-28 Thread Jason Gunthorpe
on the state of the rdma cm id. The librdmacm already invokes this after rdma_resolve_addr completes. Great, I didn't realize that was there. No further comments from me then Reviewed-By: Jason Gunthorpe jguntho...@obsidianresearch.com Jason

[PATCH libibverbs] Make ibv_get_device_list return codes via errno

2009-10-28 Thread Jason Gunthorpe
(probably ESPIPE). Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com --- examples/asyncwatch.c |2 +- examples/device_list.c|2 +- examples/devinfo.c|4 ++-- examples/rc_pingpong.c|2 +- examples/srq_pingpong.c |2 +- examples/uc_pingpong.c

Re: [PATCH] librdmacm/mckey: enforce local binding for unmapped multicast addresses

2009-11-02 Thread Jason Gunthorpe
9f3a76deb5bfda0f8243eadfa024eb547c03f583 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe jguntho...@obsidianresearch.com Date: Mon, 2 Nov 2009 11:23:38 -0700 Subject: [PATCH] RDMA CM: Fix AF_INET6 support in multicast joining If joining to an AF_INET6 address we need to map the address to a MGID

Re: [PATCH] librdmacm/mckey: enforce local binding for unmapped multicast addresses

2009-11-03 Thread Jason Gunthorpe
On Tue, Nov 03, 2009 at 08:43:01AM -0800, Sean Hefty wrote: What's missing is Jason's patch to fix the IPv6 mapping, and a way to extend the rdma_cm to support the full range of unmapped addresses. I just haven't been able to get to either of these yet. My feeling is when AF_IB is

[PATCH] RDMA CM: Correct detection of SA Created MGID

2009-11-03 Thread Jason Gunthorpe
RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with FF1x:A01B::/32 as MGIDs, but the detection for the prefix was buggy, fix it up. Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com --- drivers/infiniband/core/cma.c |2 +- 1 files changed, 1 insertions(+), 1
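
For readers unfamiliar with the prefix test described in this patch summary, here is a minimal byte-wise sketch of it. This is not the cma.c code (the snippet does not show it); the function name looks_like_sa_mgid() is invented for illustration, and the check simply follows the wording "either 0 or prefixed with FF1x:A01B::/32".

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if the 16-byte IPv6 address is all zeros
 * or begins with FF1x:A01B, where x (the scope nibble) may be anything. */
bool looks_like_sa_mgid(const uint8_t addr[16])
{
    static const uint8_t zero[16];

    if (memcmp(addr, zero, sizeof(zero)) == 0)
        return true;

    return addr[0] == 0xff &&
           (addr[1] & 0xf0) == 0x10 &&  /* flags nibble is 1, scope ignored */
           addr[2] == 0xa0 &&
           addr[3] == 0x1b;
}
```

Masking out the scope nibble is the part that is easy to get wrong, which is presumably the kind of bug the one-line fix addresses.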

Re: QoS in local SA entity

2009-11-08 Thread Jason Gunthorpe
On Sun, Nov 08, 2009 at 08:25:55AM +0200, Or Gerlitz wrote: ACM is intended to be a service that's used by the librdmacm to resolve address mappings and routes. Trying to have ACM use the librdmacm ends up with a circular dependency. That's the part I'm trying to avoid. fair enough, I

Re: QoS in local SA entity

2009-11-09 Thread Jason Gunthorpe
On Mon, Nov 09, 2009 at 09:44:31AM +0200, Or Gerlitz wrote: No rdma_resolve_addr2 is needed the one that exists now has source addresses specified, I don't see that extra info is needed for AF_INET that was resolved with rdma_getaddrinfo is this AF_IB specific? The extra info in

Re: [PATCH RDMA] Fixup IPv6 support and IPv4 routing corner cases for RDMA CM

2009-11-10 Thread Jason Gunthorpe
On Tue, Nov 10, 2009 at 01:12:17PM -0800, Sean Hefty wrote: How can we make progress? What are your thoughts? I'm actually working on this today. I've taken Jason's patch as a starting point, and I'm breaking it into separate patches for merging and testing. I hope to post the patches

Re: strong ordering for data registered memory

2009-11-11 Thread Jason Gunthorpe
On Wed, Nov 11, 2009 at 05:44:59PM -0500, Richard Frank wrote: Would anyone like to throw out the list of HCAs that do this... I can guess at a few... and can ask the vendors directly.. if not.. . It would be much nicer to not hardcode names of adapters.. but that won't stop us.. :)

Re: [PATCH] IB/core: export struct ib_port

2009-11-11 Thread Jason Gunthorpe
On Wed, Nov 11, 2009 at 03:22:50PM -0800, Ralph Campbell wrote: While this is true for SLtoVL, we create other files which are device specific under the port directory too. It seems like we might need to introduce a callback into the driver to create the port specific sysfs files. Maybe give

Re: ib_post_send in drivers

2009-11-21 Thread Jason Gunthorpe
On Sat, Nov 21, 2009 at 12:17:32PM +0100, Bart Van Assche wrote: ib_post_send() has to request a completion notification for each WR, which has a negative performance impact. My opinion is that the current behavior makes ib_post_send() easier to implement, while the behavior specified in the

Re: ib_post_send in drivers

2009-11-21 Thread Jason Gunthorpe
On Sat, Nov 21, 2009 at 09:37:29PM +0100, Bart Van Assche wrote: It would be more useful IMHO if ib_post_send() would not post any WR instead of posting part of the WR's passed to ib_post_send() when the 'queue full' condition is hit. Note: SRPT uses a single SRQ per HCA, even when multiple

Re: [PATCH 9/9] ib/addr: fix ipv6 routing lookup

2009-11-23 Thread Jason Gunthorpe
On Mon, Nov 16, 2009 at 04:12:07PM -0800, Sean Hefty wrote: +static int cma_check_linklocal(struct rdma_dev_addr *dev_addr, +struct sockaddr *addr) +{ + struct sockaddr_in6 *sin6; + + if (addr->sa_family != AF_INET6) + return 0; + + sin6

Re: [PATCH 9/9] ib/addr: fix ipv6 routing lookup

2009-11-23 Thread Jason Gunthorpe
On Mon, Nov 23, 2009 at 02:07:09PM -0800, Sean Hefty wrote: The main missing test from the new version is: if (dev_addr->bound_dev_if && dev_addr->bound_dev_if != addr6->sin6_scope_id) return -EINVAL; Done on the 2nd src and dest
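
The missing test quoted in this snippet can be read as a small stand-alone predicate: once an interface is bound, a scope id naming a different interface is an error. The sketch below restates that outside the kernel; check_scope() and its plain integer parameters are illustrative stand-ins for the dev_addr and sockaddr_in6 fields, not the actual patch.

```c
#include <errno.h>

/* Hypothetical restatement of the quoted check.
 * bound_dev_if: interface index already bound, 0 if none (assumption).
 * scope_id:     sin6_scope_id of the address being examined. */
int check_scope(int bound_dev_if, unsigned int scope_id)
{
    if (bound_dev_if && (unsigned int)bound_dev_if != scope_id)
        return -EINVAL;     /* address is scoped to a different interface */
    return 0;
}
```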

Re: [PATCH 9/9] ib/addr: fix ipv6 routing lookup

2009-11-23 Thread Jason Gunthorpe
On Mon, Nov 23, 2009 at 07:43:34PM -0800, Sean Hefty wrote: The sin6_scope_id must be ignored in all cases except LL addresses. Fixing this will break the path that sets the bound_dev_if starting from rdma_resolve_addr. This is why my version was setting bound_dev_if directly in

Re: RDMAoE verbs questions

2009-11-24 Thread Jason Gunthorpe
On Tue, Nov 24, 2009 at 06:23:15PM -0500, Jeff Squyres wrote: 2. I am somewhat confused by the overloading of the term transport. It appears that a device will have ibv_device.transport_type==IBV_TRANSPORT_IB for both IB and RDMAOE devices. The only way to tell the difference is to

Re: RDMAoE verbs questions

2009-11-24 Thread Jason Gunthorpe
On Tue, Nov 24, 2009 at 09:12:53PM -0500, Jeff Squyres wrote: On Nov 24, 2009, at 7:11 PM, Jason Gunthorpe wrote: Is the same true for openmpi? If you try to run it as is on a RDMAOE interface will it work? If not I think that alone should kill this idea.. OMPI uses RDMACM (among

Re: RDMAoE verbs questions

2009-11-25 Thread Jason Gunthorpe
On Wed, Nov 25, 2009 at 04:41:08PM +0200, Eli Cohen wrote: On Wed, Nov 25, 2009 at 09:30:40AM -0500, Jeff Squyres wrote: In practice, we have seen that applications *do* need to query the transport type -- at least (real) IB vs. iWARP. It is your expectation that IB and IBoE will

Re: RDMAoE verbs questions

2009-11-30 Thread Jason Gunthorpe
On Mon, Nov 30, 2009 at 03:34:06PM +0200, Eli Cohen wrote: If we change struct ibv_port_attr transport field from enum to uint8, we eliminate binary compatibility problems. That's because the previous field is aligned to 16 bits address so that leaves us 16 bits more. Dealing with ABI

Re: RDMAoE verbs questions

2009-11-30 Thread Jason Gunthorpe
On Mon, Nov 30, 2009 at 10:50:02AM -0800, Roland Dreier wrote: If we change struct ibv_port_attr transport field from enum to uint8, we eliminate binary compatibility problems. That's because the previous field is aligned to 16 bits address so that leaves us 16 bits more. diff

Re: RDMAoE verbs questions

2009-12-01 Thread Jason Gunthorpe
On Tue, Dec 01, 2009 at 06:22:06PM +0200, Liran Liss wrote: Dealing with ABI compatibility is a different issue, this new scheme is API incompatible due to the change in semantics for existing values. For rdmacm applications, there are no semantic changes between IB and RDMAoE. So? There

Re: RDMAoE verbs questions

2009-12-04 Thread Jason Gunthorpe
On Fri, Dec 04, 2009 at 08:03:31PM -0800, Roland Dreier wrote: then I think legacy apps should be OK (port_attr size doesn't change, binary compat is still there), and new apps that do check link_layer should also be OK ... if they use an old library and/or old driver, they'll see

Re: RDMAoE verbs questions

2009-12-09 Thread Jason Gunthorpe
On Wed, Dec 09, 2009 at 11:06:41AM -0800, Roland Dreier wrote: It looks good to me. Thanks, I will take it for RDMAoE. Great... as Jason suggested, please also add in the appropriate reserved fields to pad the struct to a 32 bit boundary and zero them in the wrapper. So if there is a
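
The binary-compatibility argument in this thread (new uint8 fields fit into existing tail padding, and the wrapper zeroes them) can be sketched with a toy struct. Everything below is hypothetical: the two structs are simplified stand-ins, not the real ibv_port_attr layout, and query_port_compat() is an invented name for the wrapper idea.

```c
#include <stdint.h>

/* Hypothetical "old" layout: 4 + 2 + 4 = 10 bytes of fields, which
 * common ABIs pad to 12 so that the uint32_t member stays aligned. */
struct port_attr_old {
    uint32_t caps;
    uint16_t lid;
    uint8_t  width, speed, phys_state, sm_sl;
};

/* Hypothetical "new" layout: the two added bytes occupy the old padding,
 * so the struct size is unchanged. */
struct port_attr_new {
    uint32_t caps;
    uint16_t lid;
    uint8_t  width, speed, phys_state, sm_sl;
    uint8_t  link_layer;
    uint8_t  reserved;
};

/* Wrapper-style query: zero the new fields first, so a provider that
 * predates them (and never writes them) reports 0 rather than garbage. */
void query_port_compat(struct port_attr_new *attr,
                       void (*provider_query)(struct port_attr_new *))
{
    attr->link_layer = 0;
    attr->reserved = 0;
    if (provider_query)
        provider_query(attr);
}
```

With this arrangement a value of 0 in link_layer doubles as "not reported", which is the ambiguity the later messages in the thread go on to debate.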

Re: [PATCH] rdmaoe/libibverbs: handle binary compatibility

2009-12-10 Thread Jason Gunthorpe
On Thu, Dec 10, 2009 at 07:05:36PM +0200, Eli Cohen wrote: here is the patch I prepared based on the discussions we had. The patch is based on the last rdmaoe/libibverbs patch I sent. libmlx4 was modified too, a trivial change that changes name. Both fixes were pushed to OFED. I will send a

Re: [ewg] Re: [PATCH] rdmaoe/libibverbs: handle binary compatibility

2009-12-10 Thread Jason Gunthorpe
On Thu, Dec 10, 2009 at 11:14:55PM +0200, Eli Cohen wrote: On Thu, Dec 10, 2009 at 10:33:53AM -0700, Jason Gunthorpe wrote: Could you prepare this based on Roland's tree? This patch won't apply. I quote two patches, one for libibverbs based on 74638ac, and the other for libmlx4 based

Re: [ewg] Re: [PATCH] rdmaoe/libibverbs: handle binary compatibility

2009-12-10 Thread Jason Gunthorpe
On Thu, Dec 10, 2009 at 01:57:11PM -0800, Roland Dreier wrote: Maybe I'm wrong but I don't like setting don't know magically to IB behind the scenes. Well, it isn't just don't know it also means the kernel doesn't support the link_layer query. The kernels that don't support link_layer also

Re: SRP issues with OpenSM 3.3.3

2009-12-15 Thread Jason Gunthorpe
On Tue, Dec 15, 2009 at 09:18:19AM -0800, Ira Weiny wrote: On Tue, 15 Dec 2009 10:15:32 -0700 Jason Gunthorpe jguntho...@obsidianresearch.com wrote: However, I don't understand the comment Only set HopLimit if going through a router? This is from '#ifdef ROUTER_EXP' days

Re: [Announce] rxe dev tree available (soft RDMAoE)

2009-12-16 Thread Jason Gunthorpe
On Wed, Dec 16, 2009 at 03:06:37PM -0600, frank zago wrote: Hello, The development tree for a soft RDMA transport over Ethernet driver (rxe) is available in the OFA git repository. This is a work in progress but has enough functionality for people interested in looking at it to be

Re: [Announce] rxe dev tree available (soft RDMAoE)

2009-12-17 Thread Jason Gunthorpe
On Thu, Dec 17, 2009 at 03:43:56PM -0600, Robert Pearson wrote: I agree about the polynomial. That would have been nice. As currently proposed the embedding of the IB transport in Ethernet frames preserves the IB ICRC as part of the transport and trades the VCRC for Ethernet's CRC32 over the

Re: [Announce] rxe dev tree available (soft RDMAoE)

2009-12-17 Thread Jason Gunthorpe
On Thu, Dec 17, 2009 at 05:00:26PM -0600, Robert Pearson wrote: Of course we need to interoperate with ConnectX-en and it, as far as I know, only knows how to compute the VCRC as though the packet was going to get sent to IB including a phantom 12 byte LRH that is filled with 1's without any

Re: ipoib 10ge gateway

2010-01-05 Thread Jason Gunthorpe
On Tue, Jan 05, 2010 at 07:27:57PM -0500, Aaron Knister wrote: As an aside I would love to know how to pull an infiniband interface into a bridge. bridging only works between networks with the same L2 address scheme and support in the bridging driver for that address scheme. The vnic schemes

Re: [PATCHv7 4/9] ib_core: RoCEE CMA device binding

2010-01-07 Thread Jason Gunthorpe
On Thu, Jan 07, 2010 at 08:50:47AM -0800, Sean Hefty wrote: +route->path_rec->hop_limit = 2; The reason is that ib_init_ah_from_path() will not set IB_AH_GRH for hop_limit smaller than 2, and since that GRH is required in RoCEE, and since this is specific to RoCEE, I put 2 to make

Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Jason Gunthorpe
On Tue, Jan 12, 2010 at 02:57:35PM -0800, Roland Dreier wrote: I doubt you could benchmark the overhead of calling ib_post_recv() in the full SRP protocol. Really, I bet it's less than 100 nanoseconds to form the work request and call ib_post_recv(). Maybe I'm wrong but I really expect the

Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Jason Gunthorpe
On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] Also, I couldn't tell for sure from a cursory examination of the patch, but the initiator must be designed to not stall

Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Jason Gunthorpe
On Wed, Jan 13, 2010 at 07:57:26PM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 7:16 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho

Re: mode connected infiniband

2010-01-20 Thread Jason Gunthorpe
On Tue, Jan 19, 2010 at 05:09:49PM -0800, Roland Dreier wrote: The last time I tried to use it the kernel began reporting lots of OOM events (2.6.30 stock). I thought this was well known because CM mode uses high order allocations?? That's not well-known to me. What's the backtrace

Re: opensm/complib: redundant redeclaration of functions

2010-02-01 Thread Jason Gunthorpe
On Mon, Feb 01, 2010 at 03:05:13PM +0200, Yevgeny Kliteynik wrote: In general, here's the problem: We have cl_file.h and cl_file_osd.h. cl_file.h has include directive for cl_file_osd.h cl_file.h has the following definition of function: int foo(); cl_file_osd.h has another

Re: opensm/complib: redundant redeclaration of functions

2010-02-02 Thread Jason Gunthorpe
On Tue, Feb 02, 2010 at 03:11:26PM +0200, Yevgeny Kliteynik wrote: If no inline version is defined then the compiler just emits a normal function call, if an inline version is defined then the compiler might use it. Thanks for the idea. I read some documentation about it, and it does look

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jason Gunthorpe
On Fri, Feb 05, 2010 at 12:32:51PM -0600, Steve Wise wrote: I think we should remove the feature of allowing binds to 127.0.0.1 altogether based on Jeff's arguments and my assertion that 127.0.0.1 is a sw-loopback mechanism anyway... I don't agree, the kernel should be free to provide a

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jason Gunthorpe
On Fri, Feb 05, 2010 at 03:08:10PM -0500, Jeff Squyres wrote: On Feb 5, 2010, at 1:56 PM, Jason Gunthorpe wrote: I think we should remove the feature of allowing binds to 127.0.0.1 altogether based on Jeff's arguments and my assertion that 127.0.0.1 is a sw-loopback mechanism anyway

Re: [ewg] rdma/cm: revert associating an RDMA device when binding to loopback

2010-02-09 Thread Jason Gunthorpe
On Tue, Feb 09, 2010 at 05:01:21PM -0500, Jeff Squyres wrote: 1. Is this now the recommended way to find all the IP interfaces that support RDMA: - loop over all local IP addresses - if 127.0.0.1/8, skip - try to rdma_bind_addr() - if it succeeds and verbs ptr is != NULL, it's an RDMA

Re: Problem with XRC userspace

2010-02-15 Thread Jason Gunthorpe
On Mon, Feb 15, 2010 at 02:56:56PM +0200, Jack Morgenstein wrote: If I put the XRC srq fields after the pthread_cond_t, the RHEL4/5 incompatibility kicks in, in a big way. If I put them before the pthread_cond_t, we still have a problem with events_completed, as indicated by Gleb Nabokov of

Re: is it possible to avoid syncing after an rdma write?

2010-02-16 Thread Jason Gunthorpe
On Tue, Feb 16, 2010 at 03:29:48PM -0800, Andy Grover wrote: Right now, RDS follows each RDMA write op with a Send op, which 1) causes an interrupt and 2) includes the info we need to call ib_dma_sync_sg_for_cpu() for the target of the rdma write. We want to omit the Send. If we don't do the

Re: is it possible to avoid syncing after an rdma write?

2010-02-17 Thread Jason Gunthorpe
On Tue, Feb 16, 2010 at 10:40:45PM -0800, Paul Grun wrote: Two advantages come to mind vs an RDMA Write followed by a SEND: Using a SEND will consume a second WQE on the send side, and the synchronizing SEND will cause an entire new transaction, which will consume a(n infinitesimally) small

Re: [net-next-2.6 PATCH] infiniband: convert to use netdev_for_each_mc_addr

2010-02-27 Thread Jason Gunthorpe
On Sat, Feb 27, 2010 at 11:38:37AM +0100, Jiri Pirko wrote: The problem this statement is trying to solve had to do with bonding creating multicast addresses for ethernet rather than infiniband in some cases. This happens because bonding makes a device that switches from ethernet to

Re: adding path record wire format to libibverbs

2010-03-18 Thread Jason Gunthorpe
On Wed, Mar 17, 2010 at 10:48:30AM -0700, Sean Hefty wrote: Would you accept a patch to libibverbs sa.h to add the wire format for a path record, shown below? I'm not ready to submit a patch yet, just need to find a place for this. Could we use the bitfield versions of this I posted awhile

Re: adding path record wire format to libibverbs

2010-03-18 Thread Jason Gunthorpe
On Thu, Mar 18, 2010 at 10:49:01AM -0700, Roland Dreier wrote: Including this structure is no problem in principle. Could we use the bitfield versions of this I posted awhile back? :) I don't have that patch handy, but bitfields often end up ugly with endian stuff, and also some

Re: [net-next-2.6 PATCH] ipoib: remove addrlen check for mc addresses

2010-03-22 Thread Jason Gunthorpe
On Mon, Mar 22, 2010 at 02:21:39PM +0100, Jiri Pirko wrote: Finally this bit can be removed. Currently, after the bonding driver is changed/fixed (32a806c194ea112cfab00f558482dd97bee5e44e net-next-2.6), that's not possible for an addr with different length than dev->addr_len to be present in

Re: [net-next-2.6 PATCH] ipoib: remove addrlen check for mc addresses

2010-03-22 Thread Jason Gunthorpe
On Mon, Mar 22, 2010 at 06:26:14PM +0100, Jiri Pirko wrote: Mon, Mar 22, 2010 at 05:59:16PM CET, jguntho...@obsidianresearch.com wrote: On Mon, Mar 22, 2010 at 02:21:39PM +0100, Jiri Pirko wrote: Finally this bit can be removed. Currently, after the bonding driver is changed/fixed

Re: Ummunotify: progress at last!

2010-03-23 Thread Jason Gunthorpe
On Tue, Mar 23, 2010 at 12:06:50PM -0400, Jeff Squyres wrote: IBM has found a resource that they think will be able to progress Roland's ummunotify work. After a few discussions in Sonoma last week and some off-list emails, here's what we decided: 1. Take Roland's last code drop

Re: Ummunotify: progress at last!

2010-03-23 Thread Jason Gunthorpe
On Tue, Mar 23, 2010 at 01:17:40PM -0400, Jeff Squyres wrote: On Mar 23, 2010, at 12:59 PM, Jason Gunthorpe wrote: The main reason for the new FD is so it can be polled on.. What do you poll on the fd for? With ummunotify, you only read() from the fd when (counter != last_counter

Re: Ummunotify: progress at last!

2010-03-23 Thread Jason Gunthorpe
On Tue, Mar 23, 2010 at 03:17:12PM -0400, Jeff Squyres wrote: If you don't think that is worth doing it does simplify things alot, just add two new verbs calls: ibv_set_mmu_counter(verbs, my_counter); ibv_get_mmu_notifications(verbs, my_list, sizeof(my_list)); I have no real opinion

Re: Ummunotify: progress at last!

2010-03-23 Thread Jason Gunthorpe
On Tue, Mar 23, 2010 at 04:01:21PM -0400, Jeff Squyres wrote: On Mar 23, 2010, at 3:52 PM, Jason Gunthorpe wrote: ibv_set_mmu_counter(verbs, my_counter); ibv_get_mmu_notifications(verbs, my_list, sizeof(my_list)); These are not hiding mmap/read, they are new uverbs 'syscalls

Re: Ummunotify: progress at last!

2010-03-23 Thread Jason Gunthorpe
On Tue, Mar 23, 2010 at 10:55:08PM -0700, Roland Dreier wrote: I would prefer to do this by adding a new verbs call that returns a fd directly. Ie use ib_uverbs_alloc_event_file and act like ibv_create_comp_channel. The main reason for the new FD is so it can be polled on..

Re: Ummunotify: progress at last!

2010-03-24 Thread Jason Gunthorpe
On Tue, Mar 23, 2010 at 10:59:42PM -0700, Roland Dreier wrote: That is all definitely doable. I wonder if it's better to get rid of the dedicated fd though. After all, having the fd means a fancy app can do poll() or sigio or whatever internally. Being able to integrate into an fd-driven

Re: [RFC] [PATCH 5/22 v2] [for 2.6.36] rdma/cm: update port reservation to support AF_IB

2010-03-25 Thread Jason Gunthorpe
On Thu, Mar 25, 2010 at 12:05:33PM -0700, Sean Hefty wrote: + case AF_IB: + ((struct sockaddr_ib *) addr)->sib_sid = + cpu_to_be64(((u64) id_priv->id.ps << 16) + ntohs(port)); + break; + } Could you elaborate a bit on how you are mixing the
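
A minimal sketch of the packing that the quoted hunk appears to perform, reading it as placing the port space in the bits above a 16-bit port number. The helper names (make_sid, sid_port, sid_port_space) and EXAMPLE_PS_TCP are made up for illustration and are not part of the patch; the sketch works in host byte order and leaves out the cpu_to_be64() conversion the patch applies before storing the value.

```c
#include <stdint.h>

/* Illustrative constant only: assumes the RDMA_PS_TCP value 0x0106. */
#define EXAMPLE_PS_TCP 0x0106u

/* Pack a port space and a host-order port into one 64-bit value:
 * the port space lands in bits 16..31, the port in bits 0..15. */
static uint64_t make_sid(uint16_t port_space, uint16_t port_host_order)
{
    return ((uint64_t)port_space << 16) + port_host_order;
}

/* Recover the low 16 bits (the port). */
static uint16_t sid_port(uint64_t sid)
{
    return (uint16_t)(sid & 0xffff);
}

/* Recover bits 16..31 (the port space). */
static uint16_t sid_port_space(uint64_t sid)
{
    return (uint16_t)((sid >> 16) & 0xffff);
}
```

Because the port occupies exactly 16 bits, the addition in the packing step can never carry into the port-space field, so the two halves can be split apart again without loss.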

Re: [RFC] [PATCH 5/22 v2] [for 2.6.36] rdma/cm: update port reservation to support AF_IB

2010-03-25 Thread Jason Gunthorpe
On Thu, Mar 25, 2010 at 08:56:23PM -0700, Sean Hefty wrote: Is there any reason the port space has to be known when the cm_id is created but before bind? No - but it is still required for transport neutrality. rdma_create_id() simply stores the value. To correct this slightly, a user can

Re: [RFC] [PATCH 5/22 v2] [for 2.6.36] rdma/cm: update port reservation to support AF_IB

2010-03-26 Thread Jason Gunthorpe
On Fri, Mar 26, 2010 at 10:17:13AM -0700, Sean Hefty wrote: Consider UDP, an application can create an id and join multicast groups using AF_INET, AF_INET6, and/or AF_IB - whatever addressing scheme is most convenient. Actually, the port space doesn't matter when specifying a multicast

Re: [RFC] [PATCH 5/22 v2] [for 2.6.36] rdma/cm: update port reservation to support AF_IB

2010-03-26 Thread Jason Gunthorpe
On Fri, Mar 26, 2010 at 05:11:04PM -0700, Sean Hefty wrote: Actually, the port space doesn't matter when specifying a multicast group. So, I can't think of any real advantage to supporting AF_IB/PS_TCP. Okay - I found one advantage - fewer changes. :b Lol! I've started adding

Re: nfsrdma broken on 2.6.34-rc1?

2010-03-29 Thread Jason Gunthorpe
On Mon, Mar 29, 2010 at 12:01:07PM -0700, Roland Dreier wrote: The rdma_cm might be able to support this if the port space were separated based on the address family, depending on how PS IB ends up. I think separate port spaces is the correct solution. This gets a bit tricky --

Re: nfsrdma broken on 2.6.34-rc1?

2010-03-29 Thread Jason Gunthorpe
On Mon, Mar 29, 2010 at 02:51:46PM -0500, Steve Wise wrote: Yeah, exactly, it is very complex and there is a real need for things pretending to be IP to capture all this subtlety. The details can't just be skipped over, people will notice :( Though, I'm also not entirely certain that

Re: [PATCH 9/26] rdma/cm: set qkey for port space IB

2010-04-01 Thread Jason Gunthorpe
On Thu, Apr 01, 2010 at 10:08:41AM -0700, Sean Hefty wrote: Include RDMA_PS_IB checks when setting the qkey for UD QPs. I know we talked about this, but seeing this patch makes me ask again, should the QKEY be part of sockaddr_ib? Or at least be settable somehow? Jason -- To unsubscribe from

Re: [PATCH 9/26] rdma/cm: set qkey for port space IB

2010-04-01 Thread Jason Gunthorpe
On Thu, Apr 01, 2010 at 11:00:44AM -0700, Sean Hefty wrote: I know we talked about this, but seeing this patch makes me ask again, should the QKEY be part of sockaddr_ib? Or at least be settable somehow? I think so. I'm not sure of the best approach. With these patches, the qkey seems to

Re: [PATCH 26/26] rdma/ucm: allow user space to specify AF_IB when joining multicast

2010-04-01 Thread Jason Gunthorpe
On Thu, Apr 01, 2010 at 10:46:06AM -0700, Sean Hefty wrote: } else if ((addr->sa_family == AF_INET6)) { ipv6_ib_mc_map(&sin6->sin6_addr, dev_addr->broadcast, mc_map); - if (id_priv->id.ps == RDMA_PS_UDP) + if (id_priv->id.ps == RDMA_PS_UDP || id_priv->id.ps ==

Re: [PATCH 26/37] librdmacm: set src_addr in rdma_getaddrinfo

2010-04-07 Thread Jason Gunthorpe
On Wed, Apr 07, 2010 at 10:12:43AM -0700, Sean Hefty wrote: RDMA requires the user to allocate hardware resources before establishing a connection. To support this, the user must know the source address that the connection will use to reach the remote endpoint. Modify rdma_getaddrinfo to

Re: [PATCH 22/37] librdmacm: add new call to create id

2010-04-07 Thread Jason Gunthorpe
On Wed, Apr 07, 2010 at 10:12:44AM -0700, Sean Hefty wrote: + * The rdma_cm_id will be set to use synchronous operations (connect, + * listen, and get_request). To convert to synchronous operation, the ^ asynchronous? Jason -- To

Re: Ummunotify: progress at last!

2010-04-07 Thread Jason Gunthorpe
On Wed, Apr 07, 2010 at 12:37:03PM -0700, Roland Dreier wrote: No, there is no mmap. Like this: u64 my_counter = 0; ibv_set_mmu_counter(verbs, my_counter); [..] while (my_counter != last_my_counter) { last_my_counter = my_counter;
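
A rough sketch of the counter-comparison loop the snippet above begins, under loud assumptions: the ibv_set_mmu_counter call discussed in this thread was only a proposal, so nothing here calls into libibverbs. The counter is a plain variable standing in for the one the kernel would bump, and drain_notifications is a hypothetical stand-in for whatever call would fetch the pending events.

```c
#include <stdint.h>

/* Stand-in for the counter the kernel would increment on each event. */
static volatile uint64_t my_counter;

/* Hypothetical stand-in for fetching pending notifications:
 * reports how many were pending and marks them consumed. */
static int drain_notifications(int *pending)
{
    int n = *pending;
    *pending = 0;
    return n;
}

/* Fast path: if the counter has not moved since the last check,
 * do nothing. Otherwise record the new value first, then drain, and
 * loop again in case the counter moved while draining.
 * Returns the number of notifications handled. */
static int check_invalidations(uint64_t *last_counter, int *pending)
{
    int handled = 0;

    while (my_counter != *last_counter) {
        *last_counter = my_counter;
        handled += drain_notifications(pending);
    }
    return handled;
}
```

Recording the counter before draining means an event that arrives mid-drain changes the counter again and forces another pass, rather than being missed.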

Re: [PATCH 26/37] librdmacm: set src_addr in rdma_getaddrinfo

2010-04-07 Thread Jason Gunthorpe
On Wed, Apr 07, 2010 at 12:54:56PM -0700, Sean Hefty wrote: I haven't looked through everything you posted to make a suggestion here, but this bothers me.. The resources should be allocated after the rdma_bind syscall, prior to listen/accept or connect, IMHO. How does the rai->ai_src_addr
