Re: [ewg] [PATCH] mlx4: remove limitation on LSO header size

2009-10-04 Thread Or Gerlitz
Eli Cohen wrote: Current code has a limitation as for the size of an LSO header not allowed to cross a 64 byte boundary. This patch removes this limitation by setting the WQE RR for large headers thus allowing LSO headers of any size. The extra buffer reserved for MLX4_IB_QP_LSO QPs has been

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-06 Thread Or Gerlitz
Sean Hefty wrote: Provide an option for users to manually specify the socket address to DGID mapping on InfiniBand. Currently, all mappings are done using ipoib, and involve ARP. This will not work across IP subnets, and alternative mechanisms of resolving the mapping are being explored.

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-07 Thread Or Gerlitz
Sean Hefty wrote: From user space, the call sequence does not change.  The user calls rdma_resolve_addr, rdma_resolve_route, rdma_connect, etc.  It is up to the librdmacm to perform the resolution.  Today, the resolution request is simply passed down to the kernel, which restricts how the

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-08 Thread Or Gerlitz
On Thu, Oct 8, 2009 at 1:42 AM, Sean Hefty sean.he...@intel.com wrote: My intent, which differs from Jason's, was to fully support the existing librdmacm interfaces as they are defined. yes, I agree this is the way to go Implementation wise, if the user of the librdmacm calls

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-08 Thread Or Gerlitz
Sean Hefty sean.he...@intel.com wrote: When used over IB, the IP address is little more than a qualifier contained within the IB CM REQ private data. If we added support for AF_GID/AF_IB to the kernel, the rdma_cm could leave all of the private data carried in the IB CM REQ entirely up to the

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-08 Thread Or Gerlitz
Jason Gunthorpe jguntho...@obsidianresearch.com wrote: If the listening side continues to use the IP mode to listen then I guess the client can compute an appropriate service ID, but it seems a bit strange for one side to use IP and the other side to use the ACM method? I was imagining you'd

Re: [ewg] rping is not resolving ipv6 addresses

2009-10-09 Thread Or Gerlitz
David J. Wilder dwil...@us.ibm.com wrote: I added an option to rping to specify a source address and supply it to patch? rdma_resolve_addr(), but now it is failing rdma_resolve_route(). $ ./rping -d  -c -v -a fe80::202:c903:1:1925 -i fe80::202:c903:1:28ed cma_event type

Re: [ewg] rping is not resolving ipv6 addresses

2009-10-09 Thread Or Gerlitz
David J. Wilder dwil...@us.ibm.com wrote: If I run rping without my rping change to add the source address to rdma_resolve_address(),  ip neigh show gives:  fe80::202:c903:1:1925 dev eth1  FAILED Notice that interface is incorrect, it should be ib0. tcpdump showed the neighbor-discovery

Re: rping is not resolving ipv6 addresses

2009-10-11 Thread Or Gerlitz
Sean Hefty wrote: The rdma cm was never fully coded or tested for ipv6 support. Sean, even if not fully coded/tested, some work has been done, e.g commits 38617c64 RDMA/addr: Add support for translating IPv6 addresses and 1f5175ad RDMA/cma: Add IPv6 support. I suggest we'll try to see what

Re: [PATCH 1/2] rdma/cm: support option to allow manually setting IB path

2009-10-13 Thread Or Gerlitz
Sean Hefty wrote: Before spending any more time on this patch series, is there any disagreement to accepting this patch (as is or slightly modified) upstream? Hi Sean, This patch just sets a route to the kernel and have the kernel issue a route resolved event in return, sounds good to me, I

Re: switching the active interface for bonding

2009-10-14 Thread Or Gerlitz
Sumeet Lahorani wrote: We are [...] trying to simulate the effect of a bonding failover initiated by a switch failure using echo commands in parallel to the /sys/class/net/bond0/bonding/active_slave file on a few of the nodes attached to the switch. Is this an acceptable technique? yes We
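
The echo-based failover described above can be scripted. Below is a minimal sketch, assuming the standard bonding sysfs layout named in the message; the function name, the slave name `ib1`, and the `sysfs_root` parameter are illustrative and not from the thread.

```python
from pathlib import Path


def set_active_slave(slave, bond="bond0", sysfs_root="/sys/class/net"):
    """Ask the bonding driver to make `slave` the active slave of `bond`.

    Does the same thing as:
        echo <slave> > /sys/class/net/<bond>/bonding/active_slave
    `sysfs_root` is a parameter only so the function can be tried against
    a scratch directory; on a real system the write needs root.
    """
    path = Path(sysfs_root) / bond / "bonding" / "active_slave"
    path.write_text(slave + "\n")
    return path
```

Calling `set_active_slave("ib1")` on several nodes at roughly the same time would approximate the parallel echo commands mentioned in the question.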

Re: switching the active interface for bonding

2009-10-14 Thread Or Gerlitz
Sumeet Lahorani wrote: We are using OFED 1.4.2 Please note that the bonding driver provided by the latest distros supports IPoIB. So if your distro happens to be RHEL 5.4 (or its OEL 5.4 derivative), or SLES11 you can and should use the distro provided bonding. Moving forward, OTOH customers

Re: [PATCH] librdmacm: initialize correct pthread condition in rdma_join_multicast

2009-10-22 Thread Or Gerlitz
Sean Hefty wrote: rdma_join_multicast re-initializes id_priv->cond rather than mc->cond. Fix this. Bug reported by Nir Naaman any idea what's the impact of this bug? Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to

Re: [PATCH 2/2] rdma/cm: allow user to specify IP to DGID mapping

2009-10-25 Thread Or Gerlitz
Jason Gunthorpe wrote: So why not have a more general, flexible approach? Isolating ACM from librdmacm by using AF_IB is a good idea, it keeps them separate and lets ACM and future go wherever. I hope Sean can make it work with the rdma_getaddrinfo idea, that would completely separate ACM

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-27 Thread Or Gerlitz
Jason Gunthorpe wrote: I was saying that point in the rdmacm where the rdma_cm_id is bound to a local RDMA device should have only been rdma_resolve_addr and rdma_accept. Overloading rdma_bind_addr to both bind to an IP and bind to an RDMA device was a bad API choice. As you wrote, for the

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-28 Thread Or Gerlitz
Jason Gunthorpe wrote: Wow, seriously? You do understand the purpose of review, right? I think I do, maybe not to the depth you and your arguments are, but again, repeating myself: my kind of simple argument is that your review is way beyond the --change-- suggested by a patch but rather of a

Re: [PATCH RDMA] Fixup IPv6 support and IPv4 routing corner cases for RDMA CM

2009-10-28 Thread Or Gerlitz
Jason Gunthorpe wrote: **COMPILE TESTED ONLY** any reason why other people have to test for you? Convert the address resolution process for outgoing connections to be very similar to the way the TCP stack does the same operations. This fixes many corner case bugs that can crop up.

[PATCH RESEND] ib/iser: re-write SG handling for rdma logic

2009-11-01 Thread Or Gerlitz
-writes the logic that does the above, to make it clearer and simpler. It also fixes a bug in the being aligned for rdma checks, where a start check wasn't done but rather only end check. Signed-off-by: Alexander Nezhinsky alexand...@voltaire.com Signed-off-by: Or Gerlitz ogerl...@voltaire.com Index

Re: [PATCH v4] rdma/cm: support option to allow manually setting IB path

2009-11-01 Thread Or Gerlitz
Sean Hefty wrote: Future changes to the rdma cm can expand on this framework to support the full range of features allowed by the IB CM, such as separate forward and reverse paths and APM Sean, Before enhancing the rdma-cm to support the full feature set of the IB CM, something which I

Re: Crash in bonding

2009-11-02 Thread Or Gerlitz
Pradeep Satyanarayana wrote: This crash was originally reported against Rhel5.4. However, one can recreate this crash quite easily in OFED-1.5 too. I understand that you get the crash when working with the RHEL5.4 bonding driver, correct? does it happen only with IPoIB devices acting as the

Re: QoS in local SA entity

2009-11-07 Thread Or Gerlitz
Sean Hefty wrote: I wasn't trying to limit how the SA could 'distribute' QoS information to the end nodes. ACM will obtain QoS information from the SA when it joins its multicast groups excellent... still, this is dependent on how the ACM MGIDs are constructed, I'll take a look on the code.

Re: QoS in local SA entity

2009-11-08 Thread Or Gerlitz
Jason Gunthorpe wrote: The entire point of the rdma_getaddrinfo + AF_IB is to avoid hacking up librdmacm for every address lookup/cache scheme someone invents the entire simple point I am trying to make is that rdma_getaddrinfo + AF_INET is doable, is simple and is needed to keep up the

Re: [PATCH RESEND] ib/iser: re-write SG handling for rdma logic

2009-11-09 Thread Or Gerlitz
This patch re-writes the logic that does the above, to make it clearer and simpler. It also fixes a bug in the being aligned for rdma checks, where a start check wasn't done but rather only end check. Roland, I don't see this patch in your for-next branch, any reason not to merge this?

Re: QoS in local SA entity

2009-11-09 Thread Or Gerlitz
Jason Gunthorpe wrote: The extra info in rdma_resolve_addr2 carries the IB specific path information from the rdma_getaddrinfo module to the kernel for the address pair. The entire purpose of AF_IB is to let user space tell the kernel it does not want a kernel side ND and PR query, instead

Re: LID reconfiguration

2009-11-09 Thread Or Gerlitz
One more question; I saw librdmacm which looked nice but it does not support multi-path connections. It would eliminate a lot of code if we could use this what are your needs? Or.

Re: LID reconfiguration

2009-11-09 Thread Or Gerlitz
Jeff Roberson wrote: I would want a way to specify the alternate sockaddr with automatic failover between them. Perhaps with some notification when a failover occurred From your description I still don't see what the alternate address buys you. As was suggested here, bond two IPoIB devices,

Re: Crash in bonding

2009-11-10 Thread Or Gerlitz
Pradeep Satyanarayana wrote: The crash is specific to IPoIB, and does not happen with Ethernet slaves. okay Can you explain why you plan to remove this from the newer distros? This is indeed news to me we plan to remove bonding from --ofed-- as the distro provided bonding supports ipoib,

Re: [PATCH RDMA] Fixup IPv6 support and IPv4 routing corner cases for RDMA CM

2009-11-11 Thread Or Gerlitz
Sean Hefty wrote: I'll compare my final patches against the ones submitted by David to see if anything got missed Are Jason's patches a superset of David's patches? or they need to be applied and only then David's work can be re-reviewed/merged, etc? Or.

Re: ipath now and then (was [PATCH] IB/core: export struct ib_port)

2009-11-11 Thread Or Gerlitz
On Wed, Nov 11, 2009 at 11:06 PM, Dave Olson dave.ol...@qlogic.com wrote: And yes, the ib_ipath is being fully deprecated.  The full set of patches that adds ib_qib upstream will include a subset that drops ib_ipath.   All the bug fixes and feature work have been done for ib_qib It was brought

Re: [PATCH] librdmacm/mckey: add notifications on events

2009-11-12 Thread Or Gerlitz
Sean Hefty wrote: mckey is intended to be a fairly simple send/receive multicast test program. What's the reasoning behind adding the event handling? The librdmacm examples serve for multiple purposes, among them user education on how to write rdmacm based apps and as a vehicle to

Re: [PATCHv2] infiniband-diags/ibqueryerrors: Add support for PortXmitDiscardDetails

2009-11-14 Thread Or Gerlitz
Sasha Khapyorsky wrote: I don't think this is the forum to discuss vendor bugs. no way we can commit here a fix for undocumented bug Or.

Re: [PATCH 8/9] ib/addr: simplify resolving IPv4 addresses

2009-11-16 Thread Or Gerlitz
Sean Hefty wrote: Merge resolve local/remote address resolution into a single data flow to ensure consistent access and use of the local routing tables. Sean, I reviewed patches 1-6 8 and they all look fine, I will give the whole series a try later this week to further validate them. Based

Re: [PATCH 7/9] rdma/cm: fix loopback address support

2009-11-24 Thread Or Gerlitz
Sean Hefty sean.he...@intel.com wrote: I will create a new librdmacm package that corresponds with the changes I made all my testing of the patch set with librdmacm 1.0.10 and patched 2.6.32-rc5 kernel, where as I wrote you, I was focusing on AF_INET/PS_TCP and AF_INET/PS_IPOIB. I understand

Re: [PATCH 7/9] rdma/cm: fix loopback address support

2009-11-24 Thread Or Gerlitz
Changes were your changes to mckey, plus changes Dave added to cmatose to support IPv6.  The actual library itself hasn't been modified. okay, got it. I was under the impression that mckey still misses an option to get from the user an ipv6 multicast address which isn't all zeros nor unmapped,

Re: RDMAoE verbs questions

2009-11-24 Thread Or Gerlitz
Jeff Squyres wrote: I was reviewing Mellanox's Open MPI patches for RDMAoE support Hi Jeff, Can you point us to the patch series (mail thread or some repository where they sit)? 1. It looks like there is a new field on the ibv_port_attr struct: transport. Is it expected that all

Re: RDMAoE verbs questions

2009-11-25 Thread Or Gerlitz
Jeff Squyres wrote: Here's one thread: http://www.open-mpi.org/community/lists/devel/2009/11/7063.php Jeff, looking on the threads you have sent, I didn't find a way to download the patch in a form which can be applied on a source tree, is there a way to do it through this archive? are these

Re: RDMAoE verbs questions

2009-11-25 Thread Or Gerlitz
Pavel Shamis (Pasha) wrote: The patch is attached Thanks, this patch basically replaces checks for the device transport type to be IB to a check that makes sure either the former happens or the port transport type is rdmaoe. As Jason, Tziporet and noted, the port transport type seems to be

Re: RDMAoE verbs questions

2009-11-26 Thread Or Gerlitz
Pavel Shamis (Pasha) wrote: The only reason for this changes is the fact that for IB devices we prefer to use our own open mpi connection managers. In case if we will decide to use RDMA-CM for all devices the number of changes will be zero... whatever, currently, this change is still there,

Re: Reliable IB connections (RC) and event ordering

2009-12-01 Thread Or Gerlitz
Roland Dreier wrote: The IBA takes into account this lack of ordering in multiple places -- defining communication established async events, etc. same goes for the IB stack... e.g take a look on the ib_cm_notify and rdma_notify APIs Or.

Re: RDMAoE verbs questions

2009-12-02 Thread Or Gerlitz
Paul Grun wrote: Why do you say that Or? I said that b/c the latest patch set posted by Mellanox doesn't support loopback, I hear now that this was a temporary limitation which will be removed, let it be. Or.

Re: InfiniBand/RDMA merge plans

2009-12-08 Thread Or Gerlitz
Roland Dreier wrote: Since 2.6.31-rc8 has been out more than a week already, it's probably a good time to talk about 2.6.32 merge plans. All the pending things that I'm aware of are listed below. Hi Roland, any update on the 2.6.33 merge plans? Or.

Re: [PATCH 06/11] RDMA/nes: abnormal listener termination causes loopback node crash

2009-12-09 Thread Or Gerlitz
Faisal Latif wrote: when listener is destroyed for loopback connection Does the upstream iwarp stack support loopback connections? does it apply to all vendors? Or.

Re: RDMAoE / lossless Ethernet (ewg: SC'09 BOF - Meeting notes)

2009-12-23 Thread Or Gerlitz
Liran Liss wrote: all the rdmaoe materials saying the lossless traffic class is a must, are you saying that this works well also without it? then why from architect point of view you have posed this requirement? lossless traffic can be achieved today using global pause, for example.

[PATCH] IB/mlx4: fix post_recv wq overflow check

2009-12-23 Thread Or Gerlitz
the post recv flow should check wq overflow using the recv and not the send cq Signed-off-by: Or Gerlitz ogerl...@voltaire.com diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 989555c..2a97c96 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers
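
To illustrate why the choice of CQ matters in an overflow check of this kind, here is a minimal sketch. It is not the mlx4 driver code; the class and function names are hypothetical, and the model only assumes that a queue's tail is advanced while holding the lock of the CQ that reaps that queue's completions.

```python
import threading


class WorkQueue:
    """Hypothetical work queue: head counts posts, tail counts reaped completions."""

    def __init__(self, max_post):
        self.head = 0
        self.tail = 0
        self.max_post = max_post


class CompletionQueue:
    """Hypothetical CQ; its lock is held by whoever polls it and bumps a tail."""

    def __init__(self):
        self.lock = threading.Lock()


def wq_overflow(wq, nreq, cq):
    """Return True if posting nreq more requests would overflow wq.

    cq must be the CQ whose polling advances wq.tail (the recv CQ for a
    receive queue); taking another CQ's lock gives no stable view of tail.
    """
    cur = wq.head - wq.tail
    if cur + nreq < wq.max_post:
        return False          # fast path: clearly enough room
    with cq.lock:             # slow path: re-read tail under the right lock
        cur = wq.head - wq.tail
    return cur + nreq >= wq.max_post
```

In this model a receive-side caller would pass its recv CQ, which is the point the patch subject makes about the post_recv path.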

Re: RDMAoE / lossless Ethernet (ewg: SC'09 BOF - Meeting notes)

2009-12-23 Thread Or Gerlitz
Roland Dreier rdre...@cisco.com wrote: I agree that implementing DCB is important for IBoE, but why do you say that a classical ethernet fabric with global pause isn't usable?  That should be roughly equivalent to an IB fabric that uses only a single VL, which is the case for many production

Re: RDMAoE / lossless Ethernet (ewg: SC'09 BOF - Meeting notes)

2009-12-24 Thread Or Gerlitz
Roland Dreier wrote: Sure, DCB is very useful, in many environments. And maybe even a requirement sometimes. I'm simply trying to say that IBoE with classical ethernet is at least as useful as standard IB in many cases Roland, Paul, Putting aside for a moment the detailed discussion we've

Re: [PATCH] IB/mlx4: fix post_recv wq overflow check

2010-01-06 Thread Or Gerlitz
Roland Dreier wrote: thanks, applied. With this not being a regression, I see that it went into your for-next branch and as such I assume will be available by 2.6.34. Are you fine with the patch going into the -stable series? Or.

Re: [PATCHv7 4/9] ib_core: RoCEE CMA device binding

2010-01-07 Thread Or Gerlitz
Eli Cohen wrote: +static int cma_resolve_rocee_route(struct rdma_id_private *id_priv) [...] + route->path_rec->hop_limit = 2; why? does this value have any specific meaning? + route->path_rec->mtu_selector = 2; all the xxx_selector usages in this code should be transformed to be from the

upstream mlx4/ib/4K mtu support

2010-01-11 Thread Or Gerlitz
Hi Vlad, I came across this ofed patch which isn't upstream. Is it a must for making mlx4/ib/4K mtu working? was it rejected from upstream? why? Or. mlx4/IB: Add set_4k_mtu module parameter. It controls Infiniband link MTU for all IB ports in a host. Signed-off-by: Vladimir Sokolovsky

Re: RDMA Read sge errors

2010-01-11 Thread Or Gerlitz
Jack, I see now that commit cd155c1 IB/mlx4: Fix creation of kernel QP with max number of send s/g entries is mainstream but not ofed 1.4.x and that mlx4_0090_fix_sq_wrs.patch (below) is in ofed but not mainstream, was it rejected from the mainline kernel? why? Or. 1. Limit qp resources

Re: [PATCH 1/3] rdma_cm: Add support for a new RDMA_PS_LUSTRE Lustre port space

2010-01-14 Thread Or Gerlitz
sebastien dugue wrote: That can be done with port numbers, except that we cannot separate traffic to Lustre MDS and traffic to Lustre OSS Looking on these patches and going with you for a minute, I don't see how this patch set serves you to assign a different QoS level (e.g SL) to MDS vs OSS

clarification on the mlx4 CQE structure

2010-01-19 Thread Or Gerlitz
Hi Yevgeny, looking at commit f780a9f mlx4_core: Add ethernet fields to CQE struct I see the following two changes: @@ -692,14 +692,13 @@ repoll: - wc->sl = cqe->sl >> 4; + wc->sl = be16_to_cpu(cqe->sl_vid >> 12); I wasn't sure if/why a conversion from

Re: clarification on the mlx4 CQE structure

2010-01-19 Thread Or Gerlitz
Yevgeny Petrilin wrote: This commit has an endianness bug, that was fixed in commit f781a22f. The cqe->sl_vid field is a be16, so we needed to convert the sl value to host order. Before the commit this field was two u8 fields, so no conversion was needed okay, got it, thanks Or.
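
The effect of shifting a big-endian 16-bit field before versus after converting it to host order can be shown with a small simulation of a little-endian host. This is only a sketch: it assumes the SL occupies the top four bits of the field (as the shift by 12 in the quoted diff suggests), the helper names are made up, and the thread does not spell out which exact expression each commit used.

```python
def swab16(x):
    """Swap the two bytes of a 16-bit value (be16_to_cpu on a little-endian host)."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)


def sl_shift_then_convert(raw):
    """Shift the unconverted field first, then byte-swap: gives a wrong SL."""
    host = int.from_bytes(raw, "little")  # big-endian bytes as an LE CPU loads them
    return swab16(host >> 12)


def sl_convert_then_shift(raw):
    """Byte-swap to host order first, then take the top four bits."""
    host = int.from_bytes(raw, "little")
    return swab16(host) >> 12
```

For the two bytes `0xA1 0x23` (big-endian value `0xA123`, so SL `0xA`), `sl_convert_then_shift` returns 10 while `sl_shift_then_convert` returns 512.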

Re: [PATCH] IB/mlx4: fix post_recv wq overflow check

2010-01-19 Thread Or Gerlitz
Roland Dreier wrote: I do think it is quite common to see this WQ overflow check trigger, even for kernel code mmm, why is that common? typically there's a higher layer to which the IB ULP advertises some sort of maximal number of credits (e.g in the SCSI case, iser and srp specify the

Re: [PATCH 1/3] rdma_cm: Add support for a new RDMA_PS_LUSTRE Lustre port space

2010-01-20 Thread Or Gerlitz
sebastien dugue wrote: So I guess you need to change the ports used within the new port space -- but then why can't you just stay in the TCP space but change the ports used? No, with the new port space, there's no need to change ports. You only need to specify the target GUIDs. For

Re: [PATCH 1/3] rdma_cm: Add support for a new RDMA_PS_LUSTRE Lustre port space

2010-01-20 Thread Or Gerlitz
sebastien dugue wrote: No, because in OpenSM's QoS logic, there's no way to map the TCP port space with specific target GUIDs onto an SL. You have keywords for SDP, SRP, RDS, ISER, ... but not for the TCP port space (or am I missing something?). going with this, what prevents you from

Re: [PATCH] IB/mlx4: fix post_recv wq overflow check

2010-01-20 Thread Or Gerlitz
Roland Dreier wrote: In other words this check catches common bugs and makes them a gazillion times easier to find and fix. So unless the performance impact is extreme, I'm inclined to leave it okay, lets leave this like that for unless someone comes with performance data that shows this is

Re: [PATCH] ib/ipoib: remove TX moderation from the ethtool related code

2010-01-20 Thread Or Gerlitz
Or Gerlitz wrote: As of commit f56bcd8 IPoIB: Use separate CQ for UD send completions, there are no TX interrupts at the main code path. Change the ethtool related code to comply with this, so that users will not be misled to assume they can control TX interrupt moderation. Hi Roland, did

Re: rdma_bind failure over iWarp

2010-01-20 Thread Or Gerlitz
Woodruff, Robert J wrote: [wo...@det-17 src]$ ucmatose -b 192.168.0.17 cmatose: starting server cmatose: bind address failed: No such file or directory return status -1 A case where rdma_bind returns -ENOENT was debugged here this week with the problem being the same IP assigned to two

Re: [PATCH 1/3] rdma_cm: Add support for a new RDMA_PS_LUSTRE Lustre port space

2010-01-21 Thread Or Gerlitz
sebastien dugue wrote: OK, then going with the TCP port space, what we need in OpenSM is a combination of service id (TCP) _and_ TCP port _and_ target GUID. I believe that you can have a 'lustre' keyword in opensm qos parser which stands for the combination of tcp port space + lustre tcp port

Re: ibv_asyncwatch and buffering

2010-01-21 Thread Or Gerlitz
Håkon Bugge wrote: That would make ibv_asyncwatch more useful in scripted environments patch? Or.

Re: [PATCH] ib/ipoib: remove TX moderation from the ethtool related code

2010-01-23 Thread Or Gerlitz
Roland Dreier wrote: Yes, looks fine, planning to merge it for 2.6.34 okay, good, I see that the for-next branch of yours is updated and already contains one patch. Or.

Re: [PATCH] libibverbs: Force line-buffering in ibv_asyncwatch

2010-01-25 Thread Or Gerlitz
Håkon Bugge wrote: I used the information at www.openfabrics.org/git/?p=ofed_1_2_5/libibverbs.git;a=summary which states the owner to be Vlad. May be that confused me. I'll send a copy to Roland Roland's user space git trees are all hosted @ kernel.org the libibverbs one is

Re: ib_write_bw hanging when using max max_inline value

2010-01-25 Thread Or Gerlitz
Håkon Bugge wrote: The capabilities in qp_init_attr used as input to ibv_create_qp() are: max_send_sge = 1, max_recv_sge = 1, max_inline_data = 928 Upon return the capabilities are modified to the following max_send_sge = 32, max_recv_sge = 1, max_inline_data = 928 Note decreasing the size of

Re: [PATCH 2/4] IB/uverbs: Support for XRC

2010-01-26 Thread Or Gerlitz
Roland Dreier wrote: Add support for core userspace XRC operations (alloc/dealloc XRC domain, create XRC SRQ), including adding an ABI for marshalling requests and responses. +++ b/include/rdma/ib_user_verbs.h @@ -81,7 +81,10 @@ enum { - IB_USER_VERBS_CMD_POST_SRQ_RECV +

DREQ timeout for rdma-cm consumers

2010-01-26 Thread Or Gerlitz
Hi Sean, I'm trying to understand what is the time out (e.g for DREQ) used by the ib cm when called by the rdmacm through rdma_connect. 1st, going empirically it looks like 100 seconds pass between a call to rdma_disconnect and getting RDMA_CM_EVENT_DISCONNECTED after taking the relevant IB port

Re: [PATCH 2/4] IB/uverbs: Support for XRC

2010-01-27 Thread Or Gerlitz
Roland Dreier wrote: OK, but we should coordinate this with all the other ABI extensions that OFED has already made I believe the only thing I need to coordinate with is the XRC patch set. Jack, can you please direct me to the patch set I should be setting the modify_cq patches against? no

Re: DREQ timeout for rdma-cm consumers

2010-01-28 Thread Or Gerlitz
Sean Hefty wrote: I believe that the IB timeout of 20 is about 4 seconds. If the packet lifetime is 1 second, then each try will take 6 seconds to timeout. For 15 retries, this is close to 100 seconds. okay, thanks for explaining this. You should be able to destroy the rdma_cm_id at
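
The arithmetic in this reply can be checked with a short calculation. The sketch below uses the InfiniBand timeout encoding of 4.096 µs × 2^value; the per-try model (timeout plus twice the packet lifetime) is an assumption read off the "4 seconds + 1 second lifetime = 6 seconds" figures in the message, and the function names are illustrative.

```python
def ib_timeout_seconds(exponent):
    """IB-encoded timeout: 4.096 microseconds times 2**exponent."""
    return 4.096e-6 * (2 ** exponent)


def total_wait_seconds(timeout_exp, packet_lifetime_s, retries):
    """Total time until all retries have timed out.

    Assumes each try costs the timeout plus one packet lifetime in each
    direction, which matches the ~6 s per try quoted in the message.
    """
    per_try = ib_timeout_seconds(timeout_exp) + 2 * packet_lifetime_s
    return per_try * retries
```

With an exponent of 20 the timeout comes to about 4.29 seconds, and 15 tries with a 1 second packet lifetime add up to roughly 94 seconds, consistent with the "close to 100 seconds" observed.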

[PATCH 0/8] ib/iser: major face lift of the data path code

2010-02-03 Thread Or Gerlitz
The following patch set removes some inefficiencies in the iser data path through simplification and reducing the amount of code, using fewer atomic operations, avoiding TX interrupts, moving to iscsi passthrough mode, etc. I did my best to build it as a sequence of patches and not as one big

[PATCH 01/9] ib/iser: revert commit bba7ebb avoid recv buffer exhaustion

2010-02-03 Thread Or Gerlitz
towards a major change in the recv buffer posting logic, with which the problem commit bba7ebb avoid recv buffer exhaustion caused by unexpected PDUs comes to solve doesn't exist any more, revert it. Signed-off-by: Or Gerlitz ogerl...@voltaire.com CC: David Disseldorp dd...@sgi.com CC: Ken

[PATCH 02/9] ib/iser: new recv buffer posting logic

2010-02-03 Thread Or Gerlitz
path. Use a pre-allocated ring of recv buffers instead of allocating from kmem cache. A special treatment is given to the login response buffer whose size must be 8K unlike the size of buffers used for any other purpose which is 128 bytes. Signed-off-by: Or Gerlitz ogerl...@voltaire.com

[PATCH 03/9] ib/iser: remove atomic counter for posted recv buffers

2010-02-03 Thread Or Gerlitz
With both the posting and reaping of recv buffers being in the completion path, their outstanding number counter need not be atomic. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |2 +- drivers/infiniband/ulp/iser/iser_initiator.c |6

[PATCH 04/9] ib/iser: use different CQ for send completions

2010-02-03 Thread Or Gerlitz
Use a different CQ for send completions, where send completions are being polled by the interrupt driven recv completion handler. As such, interrupts aren't used for the send CQ. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |3 drivers

[PATCH 05/9] ib/iser: simplify send flow/descriptors

2010-02-03 Thread Or Gerlitz
Simplify and shrink the logic/code used for the send descriptors. Changes include removal of struct iser_dto which is unnecessary abstraction, use struct iser_regd_buf only for handling SCSI commands, use dma_sync instead of dma_map/unmap, etc. Signed-off-by: Or Gerlitz ogerl...@voltaire.com

[PATCH 06/9] ib/iser: use atomic allocations

2010-02-03 Thread Or Gerlitz
Two minor flows in iser's data path still use allocations, move them to be atomic as a preparation step towards moving to use libiscsi passthrough flow. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iser_initiator.c |2 +- drivers/infiniband/ulp/iser

[PATCH 07/9] ib/iser: remove unnecessary connection checks

2010-02-03 Thread Or Gerlitz
-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |3 -- drivers/infiniband/ulp/iser/iser_initiator.c | 38 --- drivers/infiniband/ulp/iser/iser_verbs.c | 11 --- 3 files changed, 52 deletions(-) Index: linux-2.6.33-rc4/drivers

[PATCH 08/9] ib/iser: move to use libiscsi passthrough mode

2010-02-03 Thread Or Gerlitz
is removed. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iscsi_iser.c |2 +- drivers/infiniband/ulp/iser/iser_initiator.c | 12 2 files changed, 1 insertion(+), 13 deletions(-) Index: linux-2.6.33-rc4/drivers/infiniband/ulp/iser

Re: [PATCH 08/9] ib/iser: move to use libiscsi passthrough mode

2010-02-04 Thread Or Gerlitz
by the passthrough flow of libiscsi. Since the queue/worker aren't used in this mode, the code that schedules the xmitworker is removed. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- changes from V1: - remove calls to iscsi_conn_failure which are under now both buggy and not needed, instead

Re: [PATCH 05/9] ib/iser: simplify send flow/descriptors

2010-02-04 Thread Or Gerlitz
Or Gerlitz wrote: Simplify and shrink the logic/code used for the send descriptors. Changes include removal of struct iser_dto which is unnecessary abstraction, use struct iser_regd_buf only for handling SCSI commands, use dma_sync instead of dma_map/unmap, etc. it turns out that bunch

[PATCH] rdma/nes: change WQ overflow return code

2010-02-04 Thread Or Gerlitz
change the nes driver to return -ENOMEM on SQ/RQ overflow in the manner done by other rdma hw drivers (e.g cxgb3, ehca, mlx4, mthca) Signed-off-by: Or Gerlitz ogerl...@voltaire.com Index: linux-2.6.33-rc4/drivers/infiniband/hw/nes/nes_verbs.c

Re: [PATCH 0/8] ib/iser: major face lift of the data path code

2010-02-04 Thread Or Gerlitz
Bart Van Assche wrote: Sounds really interesting. Do you have numbers available about how much these patches improve throughput or decrease latency ? Yes, generally speaking after the patches the initiator peaks to about 300-400K IOPS with latency under such load being 20-30us and before the

Re: [PATCH 0/8] ib/iser: major face lift of the data path code

2010-02-07 Thread Or Gerlitz
Vladislav Bolkhovitin wrote: Or Gerlitz wrote: From where did you get those latency numbers? read iostat(8), you'll see that await is The average time (in milliseconds) for I/O requests issued to the device to be served what kind of test did you do? I connected a Linux box through iser

[PATCH V2 02/9] ib/iser: new recv buffer posting logic

2010-02-08 Thread Or Gerlitz
path. Use a pre-allocated ring of recv buffers instead of allocating from kmem cache. A special treatment is given to the login response buffer whose size must be 8K unlike the size of buffers used for any other purpose which is 128 bytes. Signed-off-by: Or Gerlitz ogerl...@voltaire.com

[PATCH V2 03/9] ib/iser: remove atomic counter for posted recv buffers

2010-02-08 Thread Or Gerlitz
With both the posting and reaping of recv buffers being in the completion path, their outstanding number counter need not be atomic. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |2 +- drivers/infiniband/ulp/iser/iser_initiator.c |6

[PATCH V2 04/9] ib/iser: use different CQ for send completions

2010-02-08 Thread Or Gerlitz
Use a different CQ for send completions, where send completions are being polled by the interrupt driven recv completion handler. As such, interrupts aren't used for the send CQ. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |3 drivers

[PATCH V2 05/9] ib/iser: simplify send flow/descriptors

2010-02-08 Thread Or Gerlitz
Simplify and shrink the logic/code used for the send descriptors. Changes include removal of struct iser_dto which is unnecessary abstraction, use struct iser_regd_buf only for handling SCSI commands, use dma_sync instead of dma_map/unmap, etc. Signed-off-by: Or Gerlitz ogerl...@voltaire.com

[PATCH V2 06/9] ib/iser: use atomic allocations

2010-02-08 Thread Or Gerlitz
Two minor flows in iser's data path still use allocations; move them to be atomic as a preparation step towards moving to use libiscsi passthrough mode. Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/iser/iser_initiator.c |2 +- drivers/infiniband/ulp/iser

[PATCH V2 09/9] remove redundant locking from iser scsi command response flow

2010-02-08 Thread Or Gerlitz
to the scsi command completion flows. Signed-off-by: Or Gerlitz ogerl...@voltaire.com Reviewed-by: Mike Christie micha...@cs.wisc.edu --- drivers/infiniband/ulp/iser/iser_initiator.c | 25 - 1 file changed, 25 deletions(-) Index: linux-2.6.33-rc7/drivers/infiniband/ulp

[PATCH RESEND V2 09/9] ib/iser: remove redundant locking from iser scsi command response flow

2010-02-08 Thread Or Gerlitz
to the scsi command completion flows. Signed-off-by: Or Gerlitz ogerl...@voltaire.com Reviewed-by: Mike Christie micha...@cs.wisc.edu --- resending with a fixed subject line which contains the ib/iser: prefix drivers/infiniband/ulp/iser/iser_initiator.c | 25 - 1 file

Re: [PATCH V2 00/9] ib/iser: major face lift of the data path code

2010-02-09 Thread Or Gerlitz
Or Gerlitz wrote: I'd be happy to get this in for the 2.6.34 merge window which is coming quite soon Roland, looking on your for-next branch I see that it contains single one liner patch RDMA/cxgb3: Remove BUG_ON() on CQ rearm failure for 2.6.34 while there are couple of patches targeted

Re: [PATCH 00/23 v3] mlx4: multi-function framework and Ethernet SRIOV

2010-02-09 Thread Or Gerlitz
Yevgeny Petrilin yevge...@mellanox.co.il wrote: This is the third version of these patches, the main changes from previous time: Yevgeny, I just realized that you've posted this series to gene...@lists.openfabrics.org which is not active any more, and as such, this patch series wasn't posted

Re: Simplified iWARP Consumer Library

2010-02-15 Thread Or Gerlitz
Philip Frey wrote: as announced last week (on gene...@lists.openfabrics.org) [...] I would be very interested in your feedback! The general list hasn't been functional since last fall; as such, your announcement wasn't seen by any of the recipients who weren't directly CCed... can you please resend it

Re: is it possible to avoid syncing after an rdma write?

2010-02-17 Thread Or Gerlitz
Andy Grover wrote: RDS follows each RDMA write op with a Send op [...] we want to omit the Send Andy, One way or another, the side which isn't initiating the rdma write has to be notified that the local buffer rkey (stag) they advertised can now be invalidated from the HCA/RNIC IOMMU, its

Re: [net-next-2.6 PATCH] infiniband: convert to use netdev_for_each_mc_addr

2010-02-28 Thread Or Gerlitz
Jason Gunthorpe wrote: Jiri Pirko wrote: when bonding changes its type, flush mc addresses and start over. There was a patch posted that tried to do something like what you are describing Indeed, Jason, commit 75c785 bonding: remap muticast addresses without using dev_close() and

Re: Setting QP attributes with RDMA CM

2010-03-03 Thread Or Gerlitz
Todd Strader wrote: I'm using the RDMA CM to set up a QP and I'm trying to figure out if I can suggest QP attributes to it before it transitions through all the states see rdma_connect(3) Or.

[PATCH 1/2] ib/ipoib: allow disabling/enabling TSO through ethtool

2010-03-04 Thread Or Gerlitz
allow disabling/enabling TSO on the fly by ethtool Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 19 +++ 1 file changed, 19 insertions(+) Index: linux-2.6.33/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c

[PATCH 2/2] ib/ipoib: include err code in trace message for ib_post_send() failures

2010-03-04 Thread Or Gerlitz
print the return code of ib_post_send() if it fails to help debug errors Signed-off-by: Or Gerlitz ogerl...@voltaire.com --- drivers/infiniband/ulp/ipoib/ipoib_cm.c |8 +--- drivers/infiniband/ulp/ipoib/ipoib_ib.c |9 + 2 files changed, 10 insertions(+), 7 deletions

Re: IPoIB issues

2010-03-10 Thread Or Gerlitz
Eli Cohen wrote: The patch does not address these failures directly but maybe as a side effect they would go away too. The patch seems to solve a case of possible live lock happening in a node which has both CM and datagram neighbors e.g where ipoib have called netif_stop etc but there is now

[PATCH] librdmacm: document/clarify the delivery of connection established event

2010-03-18 Thread Or Gerlitz
Applications based on the rdma-cm may assume that established event is always delivered by the kernel stack; clarify that. Signed-off-by: Or Gerlitz ogerl...@voltaire.com diff --git a/man/rdma_notify.3 b/man/rdma_notify.3 index 7114ac4..82e1008 100644 --- a/man/rdma_notify.3 +++ b/man

Unicast, no dst warning from IPoIB

2010-03-22 Thread Or Gerlitz
Roland, Recently (e.g. now with 2.6.34-rc2) I came across this warning from ipoib_start_xmit. I wasn't sure if it suggests that there's some real problem or not. It happens a few times and then vanishes; for some reason the type is always 0002 (ETH_P_AX25) ib0: Unicast, no dst: type 0002,
