Eli Cohen wrote:
Could you send a link to the git tree where I can find this commit and
the related fixes?
basically, as the subject line suggests, it should be in Dave's net-next tree
Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to
Roland,
This patch series fixes and reduces DM multipath fail-over / time
over iscsi/iser, the core patch is #3.
Or.
to the transport back-pointer, as
it may already point to a different transport connection.
Signed-off-by: Or Gerlitz ogerl...@voltaire.com
---
drivers/infiniband/ulp/iser/iser_verbs.c |2 --
1 file changed, 2 deletions(-)
Index: linux-2.6.34-rc6/drivers/infiniband/ulp/iser/iser_verbs.c
, terminate
2. conn bind, stop/destroy
3. cma id create, disconnect/error/timeout callbacks
Signed-off-by: Or Gerlitz ogerl...@voltaire.com
---
with this patch, multipath fail-over time is about 30 seconds,
which is seen here, when a DD over the multi-path device is done
before/during/after the fail
Roland Dreier wrote:
+CXGB4 ETHERNET DRIVER (CXGB4)
not sure who's the butterfly that caused this, but this was somehow
committed as CXGB4 ETHERNET DRIVER (CXGB3) and same goes for the IW_
piece
Or.
() = iser_route_handler() = iser_create_ib_conn_res()
if we fail here, eventually iser_conn_release() is called, resulted in double
free.
Signed-off-by: Dan Carpenter erro...@gmail.com
Signed-off-by: Or Gerlitz
Or Gerlitz ogerl...@voltaire.com wrote:
[...] with this patch, multipath fail-over time is about 30 seconds, which
is seen here,
when a DD over the multi-path device is done before/during/after the
fail-over [...] without
this patch, multipath fail-over time is about 130 seconds
Hi
Roland Dreier rdre...@cisco.com wrote:
I have these 3 + Dan Carpenter's fix applied now.
cool
Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Eli Cohen wrote:
Roland Dreier wrote:
@@ -1007,7 +1010,7 @@ static void ib_sa_add_one(struct ib_device *device)
- sa_dev = kmalloc(sizeof *sa_dev +
+ sa_dev = kzalloc(sizeof *sa_dev +
Do you happen to remember why you needed these kmalloc -> kzalloc conversions?
I can't remember
Eli Cohen wrote:
Roland Dreier wrote:
Why do we not allow umad for IBoE ports? I understand there's no QP0 but why
can't userspace use QP1 just like for IB link layer ports?
Currently QP1 is only used by the CM protocol which is implemented in the
kernel. Since we handle the iboe
Sean Hefty wrote:
I've pushed out release 1.0.12 of librdmacm.
Hi Sean, below is a tiny patch which will help direct users to the correct
mailing list
set the mailing list info to be linux-rdma instead of the ofa general list
signed-off-by: Or Gerlitz ogerl...@voltaire.com
diff --git
Moni Shoua wrote:
Did you try OFED-1.5.1 or even better, OFED-1.5.2? I know patches for counters
with RoCEE were submitted since OFED-1.5 and I saw it working
Mony, I'm not using ofed, sorry... I am interested in a clarification in
the context of the upstream submission, e.g does the problem
Eli Cohen wrote:
counter should work as regular in upstream kernel patches for IB link layer.
okay good, can you validate that? basically, I can set some time to clone
Roland's tree
and use the iboe branch as a basis for testing that the IB stack is live and
kicking as it used to be before
Eli Cohen wrote:
Why are you asking me to validate that? Did you actually encounter a problem
with this?
yes, I did. It didn't work with some ofed drop I was using. Anyway, as I
said, I can do some validation that IBoE doesn't break upstream IB, just
need the patches for that end, so once
On Fri, Jun 11, 2010 at 3:47 PM, Chien Tung chien.tin.t...@intel.com wrote:
V2 changes:
What you consider to be V1, this thread from 2007?
Or.
Walukiewicz, Miroslaw miroslaw.walukiew...@intel.com wrote:
The patch adds a new test application describing a usage of the
IBV_QPT_RAW_ETH
for IPv4 multicast acceleration on iWARP cards. See man mcraw for parameters
description
So this is the only raw qp related patch to librdmacm? any
Mike Heinz wrote:
This patch fixes a problem with the openibd initialization script.
On machines using slower DHCP servers, openibd frequently sets the HCA's node
description
to HCA-1. This patch modifies openibd to add a @ instead of the hostname
and adds a
small hook in the core
Walukiewicz, Miroslaw wrote:
The mckey works on UD_QP type and mcraw works on RAW_QP type.
The data payload prepared for UD and RAW_QP are on different layers.
The mckey uses rdma_join_multicast() that triggers a state machine for IB
multicast joining.
The mcraw does not trigger such
Hefty, Sean wrote:
The index isn't guaranteed to be the same across all nodes. If a consumer is
going to manually control this, they should really be forced to use the actual
pkey.
yes, I saw this confusion in action, for most users pkey index doesn't
mean anything, it may also change across
Jason Gunthorpe wrote:
Be aware that mainline and OFED are different in this regard, OFED overrides
the pkey unconditionally for multicast addresses, while mainline doesn't
Can you clarify this, please?
ipoib bonding had much the same problem with invalid maddrs, and a
patch was put in
Hi Yevgeny, Roland
I wonder if you can spare a few words on what would be the correct location
of the PCI Id table under the two tier architecture of the mlx4 driver?
If the table is placed in mlx4_core (as of today in upstream), then I
assume the mlx4_en and _ib aren't being probed by pci hot-plug
Jason Gunthorpe wrote:
OFED works on kernels that have compiled-in inline'd multicast map functions
that do not include the pkey copy, while mainline's multicast map functions do.
So to work around this there is a bit of code in OFED to overwrite the pkey in
the multicast hw address. This
ipoib child entries non-world writable
Sumeet Lahorani sumeet.lahor...@oracle.com reported that the ipoib
child entries are world writable, fix them to be root only writable
Signed-off-by: Or Gerlitz ogerl...@voltaire.com
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c
b/drivers
Roland Dreier wrote:
I think the current upstream location is correct. This matches the practice of
eg iw_cxgb3 as well as cxgb3i, bnx2i etc. This does have the disadvantage that
mlx4_en and mlx4_ib are not auto-loaded by PCI hotplug, but so it goes.
okay. Still, it's too bad that ofed ships
miroslaw.walukiew...@intel.com wrote:
adds a IB_QPT_RAW_PACKET QP type implementation for nes driver
+++ b/drivers/infiniband/hw/nes/nes_ud.c
+static const struct file_operations nes_ud_sksq_fops = {
+ .owner = THIS_MODULE,
+ .open = nes_ud_sksq_open,
+ .release =
Sumeet Lahorani wrote:
# find /sys -type f -perm -222
/sys/devices/pci:00/:00:04.0/:13:00.0/port_trigger
/sys/devices/pci:00/:00:04.0/:13:00.0/mlx4_port2
/sys/devices/pci:00/:00:04.0/:13:00.0/mlx4_port1
Jack, Tziporet
Can you clarify the status of the
Liran Liss wrote:
but keeping ib_create_ah() callable from any context is not a goal by itself.
going with your approach, if your proposed design is accepted, I believe that
you probably need to patch all the code-chains that makes calls under the
current assumption
I am looking for
Tziporet Koren wrote:
Jack is on vacation and will be back in 2 weeks. I will ask him to look at this
when he is back
All this could have been much simpler if Yevgeny was responding, he's
signed on the multi-protocol related patches shipped with ofed. So far,
I had a hard time getting responses
Roland Dreier rdre...@cisco.com wrote:
thanks, applied
I don't see it, and none of the other patches you accepted last night,
in the for-next branch of yours, where are they...?
Or.
Davis, Arlin R wrote:
There is limited debug in the non-debug builds. If you want full debugging
capabilities
you can install the source RPM and configure and make as follows [..] (OFED
target example):
okay, got that, once I built the sources by hand as you suggested I could see
debug
Jack Morgenstein wrote:
The sysfs entries you refer to are introduced in commit
7ff93f8b7ecbc36e7ffc5c11a61643821c1bfee5
which patches in ofed but not upstream are you referring to?
Hi Jack,
I took another look, indeed the mlx4_port{1,2} sysfs entries are introduced in
the commit
you
I don't think there are applications around which would use raw qp AND
are linked against libibverbs-1.0, such that they would exercise the 1_0
wrapper, so we can ignore the 1st allocation, the one at the wrapper code.
As for the 2nd allocation, since a WQE --posting-- is synchronous,
using
Walukiewicz, Miroslaw wrote:
I agree with you that it is possible to fix the post_send path in OFED.
Let me think a few days yet.
Hi Mirek,
okay.
Just one comment, the way I see it, ofed is very much not something that has
post_send path, it's a temporary, ad hoc, very far from being well
Today, the kernel neighbouring maintenance state-machine / engine
doesn't come into play for neighbours created on behalf of rdma-cm
consumers. This is b/c the send path is offloaded away from the
network-stack to the app QP, and as such the neighbour created
following the ARP request / reply
Josh England wrote:
It may be that the in-kernel field cm_id_priv has a NULL ->alt_av.port,
causing the Oops, but I don't know for sure. Any ideas on how to debug this?
seems like this was reported in the past but remained unsolved,
Jason Gunthorpe wrote:
It is a bit wider problem than just ND entries, changes in routing can
also alter the L2 address, so that needs to be tracked as well.
sure, when we did the address change work, see commit dd5bdff RDMA/cma: Add
RDMA_CM_EVENT_ADDR_CHANGE event, the problem I wanted to
drivers/infiniband/ulp/iser/iser_initiator.c
iser_initiator.c:173:5: warning: symbol 'iser_alloc_rx_descriptors' was not
declared. Should it be static?
Signed-off-by: Or Gerlitz ogerl...@voltaire.com
I didn't address these two
CHECK drivers/infiniband/hw/cxgb3/iwch_cq.c
drivers
Bob Ciotti wrote:
Maybe someone on the voltaire side can help.
I'm working the issue now Wed Jul 21 00:34:14 PDT 2010
Hi Bob,
I understand that some folks from Voltaire are working with you directly.
Or.
Josh England wrote:
Do you think upgrading to OFED-1.5.1 would help at all?
it might help you to diagnose the problem better, if you read through the
thread I pointed to (it's very short, four messages, less than two minutes),
you would see that Arthur is reporting on the lap_state and Sean is
Steve Wise wrote:
The cxgb3/4 drivers do not set IFF_NOARP and rely on ND being done as
part of connection setup. The driver will initiate ND if there isn't a
neigh entry available at the time the iwarp driver tries to send a SYN or
SYN/ACK.
okay, understood, thanks for clarifying this
Jason Gunthorpe wrote:
I'm thinking something like this..
- The RDMA CM gets the dst from its route lookup locks it and stores it.
- Instead of doing a route lookup cxgb gets the dst from RDMA CM,
locks it and stores it
- RDMA CM traps all notifications/etc and generates callback to cxgb
Hari Subramoni wrote:
[subra...@amd6 perftest]$ ./ib_rdma_bw -c 172.16.1.5
11928: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 |
duplex=0 | cma=1 |
11928: Local address: LID , QPN 00, PSN 0x5bfbba RKey 0x90042602
VAddr 0x002b27feabe000
11928: Remote address:
Hari Subramoni subra...@cse.ohio-state.edu wrote:
The nodes have LID's assigned to them and OpenSM is running fine.
I've attached the configurations of the two hosts along with this e-mail.
As Jonathan mentioned, we are able to ping between them.
are the two HCAs on each of the nodes
Eldad Zinger wrote:
event.status = ib_event->param.sidr_rep_rcvd.status
event.status = ib_event->param.rej_rcvd.reason
event.status should be 0 for success, or negative value of generic error code.
In that code, the error code is positive and does not comply with generic error
code.
Basically,
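A minimal sketch of the sign convention discussed here, assuming 0 means success, a negative value is a generic errno-style code, and a positive value is a transport-specific reason (for example an IB reject reason) passed through unmapped. The enum and the function name are hypothetical and are not part of the rdma-cm API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of an event status value. */
enum status_kind {
    STATUS_OK,         /* status == 0 */
    STATUS_ERRNO,      /* status < 0: generic error code, e.g. -ETIMEDOUT */
    STATUS_TRANSPORT   /* status > 0: transport-specific reason, unmapped */
};

/* Classify a status value purely by its sign. */
static enum status_kind classify_status(int status)
{
    if (status == 0)
        return STATUS_OK;
    if (status < 0)
        return STATUS_ERRNO;
    return STATUS_TRANSPORT;
}
```

A consumer following this convention would only hand negative values to strerror()-style helpers, and would look up positive values in the transport's own reason table.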
For user space, I would add a comment in the man pages
[PATCH] librdmacm/man: document status field semantics for rejected event
document status being the IB reject reason for RDMA_CM_EVENT_REJECTED event
Signed-off-by: Or Gerlitz ogerl...@voltaire.com
diff --git a/man/rdma_get_cm_event.3 b
enhance the cq arming code to support IB_CQ_REPORT_MISSED_EVENTS
Signed-off-by: Or Gerlitz ogerl...@voltaire.com
I noted that the IB_CQ_REPORT_MISSED_EVENTS flag was added in the same cycle
with mlx4
and maybe as of this, mlx4 didn't implement the flag, which is used by IPoIB
The patch
Eli Cohen wrote:
returning 1 means that you must poll the CQ to avoid a race condition
which is not true for mlx4.
makes sense, thanks for clarifying that.
Or.
Hefty, Sean wrote:
The original intent was to expose the transport specific status values to the
user,
rather than trying to map them.
yes, this makes sense, are you okay with documenting that, e.g in the spirit of
the patch I sent?
Or.
Walukiewicz, Miroslaw wrote:
Hello Roland, What about a series from Aleksey Senin [...] And my patch
RDMA/nes: IB_QPT_RAW_PACKET QP type support for nes driver
https://patchwork.kernel.org/patch/110252
Hi Mirek,
Reading your response @ http://marc.info/?l=linux-rdma&m=127954552519544
to the
Robert Pearson wrote:
Several new opcodes have been added since the last time ib_pack.h was updated.
These changes add them.
+++ b/include/rdma/ib_pack.h
+ IB_OPCODE_CN= 0x80,
+ IB_OPCODE_XRC = 0xA0,
Is this tied to
Bob Pearson wrote:
My interest is supporting the rxe driver, a software implementation of
the IB transport over Ethernet, [...] I spent a little time looking at
trying to exploit congestion notification to see if it would be useful in
this context.
Hi Bob,
As the IB congestion control /
Bob Pearson wrote:
I was wondering if I could use this to cause ConnectX RDMAoE senders to slow down
in response to these packets. There is a challenge managing fast ROCE senders
in networks that may not fully implement per priority pause.
Hi Bob,
QCN (IEEE 802.1 based Ethernet congestion
Faisal Latif wrote:
During a stress testing in a large cluster, multiple close event is detected
and BUG() is hit in core. The cause is [...]
Do you refer to the core of the IB stack? if not, to whose core?
Or.
Hefty, Sean wrote:
Does anyone have a system with multiple HCAs that's running a recent upstream
kernel?
Oracle has reported a bug connecting between two HCAs in the same system
using the rdma_cm
Sean,
With 2.6.35, I was hitting the reported failure (address error event, status
Latif, Faisal wrote:
BUG() was in iw_cm.ko in its close handler mentioned as core in my email and
caused by iw_nes.ko.
I see, looks like iwcm.c accounts for most of the BUG* calls made from the
core, could be nice
to reduce them over time.
Or.
# grep -n BUG drivers/infiniband/core/*.c |
Jason Gunthorpe wrote:
[...] The socket that is bound to a device will then use its device for
sending,
but other sockets not bound to devices will do route lookups and use the lo
device.
Do: [...] To see the difference in each side.
sure, makes sense, the ping-reply code does route
Jason Gunthorpe wrote:
As for the original issue we were discussing here, the conclusion is that with
upstream 2.6.35 bits for the rdma connection to go from hca1 port1 to hca1
port2 (or from hca1 port1 to hca2 port1), the rdma-cm needs a neighbour,
similarly to a ping -I ib0 to ib1 address.
Bob Pearson wrote:
I was curious to see if I could force a ConnectX device to slow down from a
remote application.
But since the MADs have been crippled for IBOE there is no way to configure
it.
QP1 MADs are working for ConnectX, e.g the IB CM is fully functional for IBoE,
and I don't
Hi Andy,
Some clarifications/questions from whatever quick look one can have over 107
patches...
Zach Brown's RDS/IB: print IB event strings as well as their number - commit
1bde04a63d532c2540d6fdee0a661530a62b1686 in net-next-2.6 looks perfect to
reside as a helper function in the core IB
Hi Andy, looking at this net-next-2.6 patch, I wonder if you can elaborate on
your "significantly helps performance" comment - what improvement do you see with
this patch?
What about the QP/CQ memory, would it be better placed node-local to the
HCA?
Or.
commit
Andrew Grover wrote:
Once net-next gets pushed to mainline and Roland pulls from that,
then we'll be in a good position to put these helpers where they should go,
and change other ULPs to use them.
Andy, as Roland commented, you can push such helpers through Dave once Roland
made a review
Sumeet sumeet.lahor...@oracle.com wrote:
It turns out that this problem was being caused because we had multiple IPs
configured on the bonded infiniband interface. It appears that grat. arps are
being sent out for only one of those IPs. [...] Can the bonding
driver be fixed to send out grat
Hi Jack,
I just came across this patch of yours which was placed in ofed 1.5.2, I didn't
see any track of it
here @ linux-rdma (any specific reason for that?) - some questions/issues to
discuss -
1st and most, (say) for 1k node cluster, is it correct that for each node doing
start/restart
Jack Morgenstein wrote:
I have not yet submitted the patch to the list.
sounds like it's about time to do that... could you send this to
review/merge into 2.6.37?
From what was commented here and further looking, the sentence [...]
Upon receiving this trap, OpenSM initiates a heavy sweep,
Eli Cohen wrote:
If you create a MR in kernel, it covers the entire address space and
the HCA does not pose any limit since you do not consume MTTs. And if
you use MTTs then the page size is a parameter in this calculation -
huge page, regular page etc.
I agree that the kernel case is not of
Eli Cohen wrote:
We have successfully tested MPI, SDP, RDS, and native Verbs applications over
IBoE.
I came across your ofed commit e5414cccaa13e6dd80d8d6fc3dafe95355facdef sdp:
module parameter
to disable SDP over ROCEE and wasn't sure what's behind it, can you clarify
that?
Or.
Amir Vadai wrote:
It is from the days that SDP over RoCE wasn't stable. In addition, when
customers had a very long delay before TCP connection established, in the
following scenario:
1. in libsdp.conf, setting mode to 'both' (Try SDP and fallback to TCP)
2. application tries to open
Eli Cohen e...@dev.mellanox.co.il wrote:
Fix the limit of max fast registration WRs that can be posted to CX to match
hardware capabilities.
Guys, can you clarify if the hardware limitation is 511 entries or it's
(PAGE_SIZE / sizeof(pointer)) - 1 which is 4096 / 8 - 1 = 511 but can
change if
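The arithmetic in the question can be checked with a small sketch. The helper name is hypothetical, and the formula is simply the one quoted above, (PAGE_SIZE / sizeof(pointer)) - 1; whether the hardware limit really follows this formula is exactly what is being asked, so this only shows how the value would move with the page size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: entries per page minus one, per the quoted formula. */
static size_t max_fast_reg_entries(size_t page_size, size_t entry_size)
{
    assert(entry_size > 0 && page_size >= entry_size);
    return page_size / entry_size - 1;
}
```

With 4096-byte pages and 8-byte pointers this gives 511; with 64 KB pages the same formula would give 8191, which is why the distinction between a fixed limit and a derived one matters.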
Bart Van Assche bvanass...@acm.org wrote:
Has anyone been looking into this before ?
nope, never ever, what hca is that?
Or.
Eli Cohen wrote:
Completions with non-zero (error) status and a wr_id / opcode
combination were received that were never queued by the application.
In case of error the opcode of the completed operation is not provided. I am
not sure why.
Eli, there's nothing in the IB spec that mandates the
Eli Cohen e...@dev.mellanox.co.il wrote:
[...] Address resolution is done atomically in the
case of a link local address or a multicast GID and otherwise -EINVAL is
returned. mlx4 transport packets were changed too to accommodate for IBoE.
Multicast groups attach/detach calls
Eli Cohen wrote:
On Sun, Oct 24, 2010 at 6:22 PM, Roland Dreier rdre...@cisco.com wrote:
No you did not. It was there already but we never noticed before Yossi's
patch.
But AFAICT Yossi's patch (5eb620c8) went into 2.6.22 about 2.5 years
ago... wasn't that already there way before the IBoE
I pulled/built/booted with the for-next branch of Roland's tree, and I can't
get IB link for the node,
I don't think this is my problem, since I'm on L2 IB and not Eth, but should
this work with pre 2.7
firmware?! if not, maybe patch the mlx4 driver to print some error,
okay, I verified
On Mon, Oct 25, 2010 at 1:34 PM, Eli Cohen e...@dev.mellanox.co.il wrote:
IBoE will not work with firmware prior to 2.7.000. I don't think an
error message is required in this case.
But I'm on **IB** not IBoE, I don't think you mean that the Linux
kernel IB stack is not functional over pre-2.7
On Mon, Oct 25, 2010 at 6:17 PM, Eli Cohen e...@dev.mellanox.co.il wrote:
On Mon, Oct 25, 2010 at 06:36:43AM -0700, Roland Dreier wrote:
I suspect I broke either the UD header packing or the build_mlx_header
function when I cleaned up the patches. I see the same problem, I'll
take a look
On Mon, Oct 25, 2010 at 4:36 PM, Eli Cohen e...@dev.mellanox.co.il wrote:
Of course not. I just noticed that the IB link for IB link layer does
come up, is that what you're seeing?
No, I didn't have IB Link when I used the for-next bits
On Mon, Oct 25, 2010 at 7:13 PM, Eli Cohen e...@dev.mellanox.co.il wrote:
On Mon, Oct 25, 2010 at 06:46:39PM +0200, Or Gerlitz wrote:
No, I didn't have IB Link when I used the for-next bits
Can you summarize what is the problem that you're seeing?
Eli, this is pretty simple, I do the following
Roland Dreier wrote:
Yep, looks like that's where my cleanup broke things. I rolled this in
and pushed it out; I'm testing it myself now.
My IB port comes to active now, I think that fixed things.
same here, I have IB port coming to active and basic IPoIB, opensm working okay
on the node
I have IB port coming to active and basic IPoIB, opensm working okay
on the node with the current for-next/IBoE bits
doing a little bit stress testing, I came across the below oops, when running
IPoIB
and couple of iperf/udp sessions, it doesn't look like a problem in the IB
stack.
Also with
doing a little bit stress testing, I came across the below oops, when running
IPoIB
and couple of iperf/udp sessions, it doesn't look like a problem in the IB
stack.
To trigger this I run from client node the following iperf -uc 192.168.21.18
-l 64000 -t 72000 -i 1 -b 40g -d -P 4 where
Hefty, Sean wrote:
[...] an alternative goal of these patches is to allow ibacm and similar
applications to detect and react to SA and CM timeouts.
Hi Sean,
As far as I understand CM timeout is an event not a mad... when
referring to detecting/reacting on CM timeouts, did you mean detecting
Aleksey Senin wrote:
The following patches add a new QP type named RAW_PACKET.
Is there anything different in this patch set compared to V1 of
https://patchwork.kernel.org/patch/110153 or is it just a repost?
Or.
Steve Wise wrote:
I'm working on similar code for Chelsio that will use these QPs.
Will the TX flow require going into kernel space or will be fully offloaded?
Or.
Hi Eli, can this patch of yours which you placed in ofed be pushed
upstream? Or.
From 4237a1fbc1bae6bb86665f81cd93cfac37b216d2 Mon Sep 17 00:00:00 2001
From: Eli Cohen e...@mellanox.co.il
Date: Wed, 3 Nov 2010 10:56:38 +0200
Subject: [PATCH] IPoIB: Fix IPoIB to conform to ethtool definitions
Hi Eli, are there known IBoE fixes which are in ofed but missed 2.6.37-rc1?
Also, can the below and/or any other enhancements you've placed in ofed be
pushed upstream? it would be great to have perf counters operating fine for IBoE
Or.
From 72c316b60f62401e031520fe3f55ec6879bbc42b Mon Sep 17
Eli Cohen wrote:
Sure, I was going to. I will send later today.
I saw that you've dropped an implementation of inline/blue-flame sending
for kernel space, what was the motivation is it sdp, rds or alike or something
else?
Or.
Eli Cohen wrote:
I was going to send [...] upstream
Also you had a fix to the port speed and something related to SL which I
didn't understand, please send for review
Or.
Eli Cohen wrote:
The idea is to let kernel consumers enjoy the improvements to latency that blue
flame gives. And yes, SDP is motivating us but I am going to push to IPoIB too.
From my recollection of numbers, for user space apps, using inline
accounts for about 1us improvement in the
Or Gerlitz wrote:
Eli Cohen wrote:
From my recollection of numbers, for user space apps, using inline
accounts for about 1us improvement in the latency, if this is indeed the
case, I'm not sure if there's great value here for kernel consumers, do you
have any numbers to support this patch?
I
Eli Cohen wrote:
It indeed improves SDP's latency - I don't have exact numbers.
the SDP number is very interesting (Amir, do you have it?) but irrelevant for
upstream, any IPoIB numbers?
Or.
Hefty, Sean wrote:
CM mads aren't reliable, however they are retried. If a CM REQ does not
receive a response after so many retries (usually 15), the REQ fails (status is
timeout). The mad layer reports the timeout to the cm module. With snooping
in place, a user will be notified that a
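The retry behaviour described here can be sketched as a simple loop. This is a simulation under stated assumptions, not the ib_cm code: the function name, the result enum, and the "attempt that gets a reply" parameter are hypothetical, and the sketch assumes one initial send followed by up to max_retries resends before the request fails with a timeout status.

```c
#include <assert.h>

enum req_result { REQ_REPLIED, REQ_TIMEOUT };

/*
 * Simulate sending a request that is retried when no response arrives.
 * reply_on_attempt: 1-based attempt number that receives a reply, 0 = never.
 * attempts_made:    set to the number of sends actually performed.
 */
static enum req_result send_req_with_retries(int max_retries,
                                             int reply_on_attempt,
                                             int *attempts_made)
{
    int attempt;

    assert(max_retries >= 0 && attempts_made);
    for (attempt = 1; attempt <= 1 + max_retries; attempt++) {
        *attempts_made = attempt;
        if (attempt == reply_on_attempt)
            return REQ_REPLIED;   /* response arrived, stop retrying */
    }
    return REQ_TIMEOUT;           /* all retries exhausted */
}
```

With max_retries set to 15, a peer that never answers costs 16 sends before the timeout is reported, which is the event a snooping agent would be notified about.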
Eli Cohen wrote:
For IPoIB it gives ~1 usec for improvement in latency.
yep, this is what I expected, so over your testbed from what value to what
value? also it
would be important to note the change in the cpu utilization (e.g few vmstat
1 output
lines before/after the change, while running
Hefty, Sean wrote:
That requires registration with the SA. The intent is to avoid using a
centralized service when possible.
yep, makes sense, look like this design finally went the decentralized way...
cool
Or.
Usha Srinivasan wrote:
Can someone from Mellanox tell me what the vendor error 0x32 means? I am
getting this error for wc.opcode 128 (IB_WC_RECV) wc.status 4
(IB_WC_LOC_PROT_ERR). I am running ofed 1.5.2 and am getting it on both rhel5
and sles11
You can't count on the wc.opcode when the
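A minimal sketch of the point being made, assuming the opcode field of a work completion is only meaningful when the status is success. The struct is a stand-in with hypothetical names, not the libibverbs definition; the values 4, 128 and 0x32 in the usage note are the ones quoted in the report above.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the work-completion fields under discussion (hypothetical). */
struct wc_sketch {
    uint64_t wr_id;       /* identifies the posted work request */
    int      status;      /* 0 means success */
    int      opcode;      /* only meaningful when status == 0 */
    uint32_t vendor_err;  /* vendor-specific error detail */
};

/* Store the opcode and return 1 only for successful completions;
 * on error return 0 and leave *opcode untouched. */
static int wc_opcode_if_valid(const struct wc_sketch *wc, int *opcode)
{
    assert(wc && opcode);
    if (wc->status != 0)
        return 0;
    *opcode = wc->opcode;
    return 1;
}
```

For a completion with status 4 and vendor error 0x32, the helper refuses to report an opcode even if the field happens to read 128; the consumer would instead match the failure to its posted request through wr_id.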
Tom Tucker t...@ogc.us wrote:
This patch changes the bus mapping logic to avoid page_address() where
necessary
Hi Tom,
Does "when necessary" come to say that invocations of page_address
which remained in the code after this patch was applied are safe and
no kmap call is needed?
Or.
Ralph Campbell wrote:
I guess what I'm objecting to is hard coding mlx4. I was trying to think of a
way that would allow other HCAs to support the block loopback option in the
future. It looks like ipoib sets IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK for
kernel QPs but this isn't defined in
Hefty, Sean wrote:
One could argue that this change is reasonable regardless of the OFED kernel
patch. It avoids sending multicast traffic when the destination is local. The
main drawback beyond the extra code is that a node can't send a multicast
message to itself, with the intent that
Hi Ido,
We came into a situation where running rdma_lat with vs. without the -c flag,
which means with or without using the rdma-cm, introduces a notable ~1us difference in
latency for 1k messages, that is ~3us w.o using rdma-cm and 3.9us when using
the rdma-cm.
I have reproduced that now with the