Having a few schemes available in the core code that the driver can choose
from seems like a much more sensible option.
I think that makes sense, but several of the schemes we are working
with are effectively single-vendor schemes. Indirect MR and DIX are
good examples of things that only one
On 7/13/2015 11:15 PM, Jason Gunthorpe wrote:
On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote:
On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote:
I think what we need to support for now is FRMR as the primary target,
and FMR as a secondar[y].
FMR is a *very* bad choice, for several
On 7/13/2015 7:30 PM, Jason Gunthorpe wrote:
On Fri, Jul 10, 2015 at 12:09:37PM +0300, Sagi Grimberg wrote:
Given the last discussions on our in-kernel memory registration API I
thought I'd propose another approach to address this.
I assume you can put your new indirect registrations under
On 7/14/2015 10:37 AM, 'Christoph Hellwig' wrote:
On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote:
On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote:
I think what we need to support for now is FRMR as the primary target,
and FMR as a secondar[y].
FMR is a *very* bad choice, for
On 7/14/2015 10:25 AM, 'Christoph Hellwig' wrote:
On Mon, Jul 13, 2015 at 10:57:48AM -0600, Jason Gunthorpe wrote:
Currently various drivers are using ib_get_dma_mr with remote flags
unfortunately, e.g. the SRP initiator driver uses it to optimize away
memory registrations for single SGL entry
On 7/13/2015 5:16 PM, Chuck Lever wrote:
NFS really should be using something more similar to a scatterlist,
as it maps pretty well to the sk_frags in the network layer as well.
Struct scatterlist is important because it's the way the DMA mapping
functions take a multi-page argument, so anyone
On 7/14/2015 3:12 PM, Tom Talpey wrote:
On 7/14/2015 5:22 AM, Sagi Grimberg wrote:
On 7/14/2015 10:37 AM, 'Christoph Hellwig' wrote:
On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote:
On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote:
I think what we need to support for now is FRMR
On 7/14/2015 3:24 PM, Tom Talpey wrote:
On 7/14/2015 4:06 AM, Sagi Grimberg wrote:
All protocols care about transferring data and sending messages, so
it's not a good enough reason for a poor registration method choice.
This just emphasizes why we need to converge to a single method.
In my
On 7/14/2015 6:33 PM, Christoph Hellwig wrote:
On Tue, Jul 14, 2015 at 11:39:24AM +0300, Sagi Grimberg wrote:
This is exactly what I don't want to do. I don't think that implicit
posting is a good idea for reasons that I mentioned earlier:
This is where I have a problem. Providing an API
I'm really disappointed by the negative emails on this subject..
Jason,
I'm really not trying to be negative. I'm hearing you out, and I agree
with a lot of what you have to say. I just don't agree with all of it.
You are right, ULPs do the same thing, the same wrong thing of
maintaining a
On 7/14/2015 7:35 PM, Jason Gunthorpe wrote:
On Tue, Jul 14, 2015 at 07:12:01PM +0300, Sagi Grimberg wrote:
The ULP doesn't care if it needs to reserve the slot, and it generally
doesn't care about the notification either unless it needs to handle an
error.
That's generally correct
On 7/16/2015 11:07 AM, Christoph Hellwig wrote:
On Thu, Jul 16, 2015 at 09:52:44AM +0300, Sagi Grimberg wrote:
I suggest to start with what I proposed. And in a later stage (if we
still think it's needed) we can have a higher level API that hides the
post, something like:
rdma_reg_sg(struct
On 7/15/2015 5:32 PM, Chuck Lever wrote:
On Jul 15, 2015, at 4:01 AM, Sagi Grimberg sa...@dev.mellanox.co.il wrote:
On 7/14/2015 8:09 PM, Jason Gunthorpe wrote:
On Tue, Jul 14, 2015 at 07:55:39PM +0300, Sagi Grimberg wrote:
But, if people think that it's better to have an API that does
On 7/15/2015 8:07 PM, Jason Gunthorpe wrote:
On Wed, Jul 15, 2015 at 12:32:33AM -0700, Christoph Hellwig wrote:
int rdma_create_mr(struct ib_pd *pd, enum rdma_mr_type mr,
u32 max_pages, int flags);
* array from a SG list
* @mr: memory region
* @sg: sg
I can drop it, unless anyone can think of a use-case where a ULP would
want to register a region with a different offset from sg[0]->offset
and/or ends before the sum(sg->length).
What if the sg list has to be chunked up due to the device's FRWR pbl depth
limits? Or is that handled underneath
On 7/14/2015 8:09 PM, Jason Gunthorpe wrote:
On Tue, Jul 14, 2015 at 07:55:39PM +0300, Sagi Grimberg wrote:
But, if people think that it's better to have an API that does implicit
posting always without notification, and then silently consume error or
flush completions. I can try and look
On 7/14/2015 11:29 PM, Jason Gunthorpe wrote:
On Tue, Jul 14, 2015 at 12:55:11PM -0700, 'Christoph Hellwig' wrote:
On Tue, Jul 14, 2015 at 02:32:31PM -0500, Steve Wise wrote:
You mean should not, yea?
Ok. I'll check for iWARP. But don't tell me to remove the transport-specific
hacks in
On 7/15/2015 10:32 AM, Christoph Hellwig wrote:
Hi Sagi,
I went over your proposal based on reviewing the ongoing MR threads
and my implementation of a similar in-driver abstraction, so here
are some proposed updates.
struct provider_mr {
u64 *page_list; // or what ever
On 7/15/2015 6:05 AM, Doug Ledford wrote:
On 07/14/2015 01:08 PM, Jason Gunthorpe wrote:
On Tue, Jul 14, 2015 at 07:46:50PM +0300, Sagi Grimberg wrote:
Which drivers don't support FRWR that we need to do other things?
ipath - deprecated
We have permission to move this to staging
On 7/14/2015 8:26 PM, Jason Gunthorpe wrote:
On Tue, Jul 14, 2015 at 12:05:53PM +0300, Sagi Grimberg wrote:
iser has it too. I have a similar patch with a flag for iser (it's
behind a bulk of patches that are still pending though).
Do we all agree and understand that stuff like
On 7/15/2015 8:25 PM, Jens Axboe wrote:
On 07/15/2015 11:19 AM, Keith Busch wrote:
On Wed, 15 Jul 2015, Bart Van Assche wrote:
* With blk-mq and scsi-mq optimal performance can only be achieved if
the relationship between MSI-X vector and NUMA node does not change
over time. This is
/**
* ib_mr_set_sg() - populate memory region buffers
* array from a SG list
* @mr: memory region
* @sg: sg list
* @sg_nents: number of elements in the sg
*
* Can fail if the HW is not able to register this
* sg list. In case of failure - caller
On 7/16/2015 9:08 PM, Jason Gunthorpe wrote:
On Thu, Jul 16, 2015 at 03:21:04PM +0300, Sagi Grimberg wrote:
I gotta say,
these suggestions of bool/write or supported_ops with a convert helper
seem (to me at least) to make things more complicated.
Why not just set the access_flags
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
Hello,
Hi Bart,
I agree it would definitely help as the lack of immediate data
emphasizes the additional latency of doing rdma reads.
As you probably know for write requests immediate data means sending
the data in the same packet as the write
On 7/20/2015 12:43 AM, Or Gerlitz wrote:
On Sun, Jul 19, 2015 at 7:07 PM, Sagi Grimberg sa...@dev.mellanox.co.il wrote:
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
I agree it would definitely help as the lack of immediate data
emphasizes the additional latency of doing rdma reads.
Sagi
On 7/11/2015 1:39 PM, Christoph Hellwig wrote:
On Fri, Jul 10, 2015 at 12:09:37PM +0300, Sagi Grimberg wrote:
And then provide helpers to populate the MR with generic kernel
structures such as struct scatterlist (for scsi and other ULPs),
struct page (for NFS) or struct bio_vec (for block ULPs
When accounting the needed_pages, we need to look into
the page_list->max_page_list_len and not the global
context xprt->sc_frmr_pg_list_len.
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |3 ++-
1 files changed, 2 insertions(+), 1 deletions
On 7/20/2015 8:13 PM, Chuck Lever wrote:
On Jul 20, 2015, at 1:00 PM, Sagi Grimberg sa...@mellanox.com wrote:
When accounting the needed_pages, we need to look into
the page_list->max_page_list_len and not the global
context xprt->sc_frmr_pg_list_len.
Signed-off-by: Sagi Grimberg sa
On 7/20/2015 7:23 PM, Jason Gunthorpe wrote:
On Sun, Jul 19, 2015 at 08:33:24AM +0300, Sagi Grimberg wrote:
I was thinking that the user won't explicitly say which key it registers
and it will be decided from the registration itself.
Meaning, the registration code will do:
Please don't
On 7/20/2015 8:00 PM, Jason Gunthorpe wrote:
On Mon, Jul 20, 2015 at 07:27:52PM +0300, Sagi Grimberg wrote:
I'm thinking now that this should have an input argument
of block_size. Maybe in the future ULPs would want to register
huge pages, it will be a shame to map it into PAGE_SIZE chunks
I'm thinking now that this should have an input argument
of block_size. Maybe in the future ULPs would want to register
huge pages, it will be a shame to map it into PAGE_SIZE chunks...
Why wouldn't it just transparently support huge pages? sg seems to
have enough information.
I'm not sure I
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey. This breaks
rpcrdma drivers.
Query and set this lkey when creating the device resources.
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
drivers/infiniband/hw/mlx5/main.c
On 7/20/2015 8:08 PM, Chuck Lever wrote:
On Jul 20, 2015, at 12:54 PM, Sagi Grimberg sa...@mellanox.com wrote:
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey. This breaks
rpcrdma drivers.
Query and set this lkey when creating
Should be all the page sizes that are supported by the
device.
Reported-by: Jason Gunthorpe jguntho...@obsidianresearch.com
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
drivers/infiniband/hw/mlx5/main.c |3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers
Bleh... seems like a great effort just to find that out. Isn't it
better to just ask for a page_size arg?
So who computes page_size and how? Don't just punt things to a caller
without really explaining how the caller is supposed to use it
correctly.
I'd imagine that the ULP knows when it
On 7/21/2015 3:03 AM, Bart Van Assche wrote:
On 07/19/2015 09:07 AM, Sagi Grimberg wrote:
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
As you probably know for write requests immediate data means sending
the data in the same packet as the write command instead of sending it
as a separate
So you have 140% better IOPS with immediate-data vs. non immediate
data?! numberz?
No, the improvement was to avoid memory copy from the pre-posted receive
buffer (with immediate-data) to an allocated buffer. Instead the receive
buffer is handed to the backend to do IO.
This shows up to 40%
On 10/21/2015 1:04 PM, Or Gerlitz wrote:
On 10/21/2015 12:53 PM, Sagi Grimberg wrote:
On 10/15/2015 2:44 PM, Eran Ben Elisha wrote:
+struct ib_uverbs_ex_create_qp {
+__u64 user_handle;
+__u32 pd_handle;
+__u32 send_cq_handle;
+__u32 recv_cq_handle;
+__u32 srq_handle
The driver now exposes sufficient limits so we can
avoid having mlx4 specific work-around.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 10 ++
1 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/ulp
This addresses a specific mlx4 issue where the max_sge_rd
is actually smaller than max_sge (rdma reads with max_sge
entries complete with error).
The second patch removes the explicit work-around from the
iser target code.
This applies on top of Christoph's device attributes modification.
Sagi
mlx4 devices (ConnectX-2, ConnectX-3) can not issue
max_sge in a single RDMA_READ request (resulting in
a completion error). Thus, expose lower max_sge_rd
to avoid this issue.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/hw/mlx4/main.c |3 ++-
1 files chan
The driver now exposes sufficient limits so we can
avoid having mlx4 specific work-around.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
Reviewed-by: Steve Wise <sw...@opengridcomputing.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 13 +++--
1 files changed,
The driver does not support it anyway, and the support
should be added to a generic layer shared by both hfi1,
qib and softroce drivers.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/staging/rdma/hfi1/keys.c | 55 -
drivers/staging/rdm
The driver does not support it anyway, and the support
should be added to a generic layer shared by both hfi1,
qib and softroce drivers.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/staging/rdma/ipath/ipath_verbs.c |3 ---
drivers/staging/rdma/ipath/ipath_verbs.h |
I can provide a patch for hfi, anything else needed?
It breaks all of them in staging, not just hfi1. So, hfi1, amso1100,
ipath, and ehca.
hfi1: Does not support FRWR at all, there are just some copy-paste
sections that supposedly handle it - so I'll drop any sign of it from
the code.
Hi Yuval,
The title prefix should be IB/mlx4:
Expose max_fmr so it will be available to ULPs.
max_fmr is num_mpts minus reserved.
Signed-off-by: Yuval Shaia
---
drivers/infiniband/hw/mlx4/main.c |1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git
Hello Sagi,
Is this the same issue as what has been discussed in
http://www.spinics.net/lists/linux-rdma/msg21799.html ?
Looks like it.
I think this patch addresses this issue, but let's CC Eli
to comment if I'm missing something.
Thanks for digging this up...
Sagi.
--
To unsubscribe from
On 27/10/2015 16:39, Or Gerlitz wrote:
On 10/27/2015 11:40 AM, Sagi Grimberg wrote:
mlx4 devices (ConnectX-2, ConnectX-3) can not issue
max_sge in a single RDMA_READ request (resulting in
a completion error). Thus, expose lower max_sge_rd
to avoid this issue.
Sagi,
Hey Or,
Still
But AFAIR, the magic number was 28... how does this go hand in hand with
your findings?
mlx4 max_sge is 32, and isert does max_sge - 2 = 30.
So it always used 30... and I run it reliably with this for a while now.
This thing exists before I was involved so I might not be familiar with
all the
and added
a root cause analysis to patch change log.
- Fixed isert qp creation to be max_sge but construct rdma
work request with the minimum of max_sge and max_sge_rd
as non-rdma sends (login rsp) take 2 sges (and some devices
have max_sge_rd = 1).
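A minimal sketch of the changelog item above, assuming the two device limits
are available as plain integers: when building an RDMA work request, cap the
SGE count at the smaller of max_sge and max_sge_rd. The struct and function
names are hypothetical and are not taken from the isert patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the two device limits discussed above. */
struct sge_limits {
	unsigned int max_sge;    /* SGEs allowed per send/recv work request */
	unsigned int max_sge_rd; /* SGEs allowed per RDMA READ work request */
};

/* SGE count to use when building an RDMA work request: the smaller limit. */
static unsigned int rdma_wr_sge_limit(const struct sge_limits *lim)
{
	/* Both limits are assumed to be non-zero on a usable device. */
	assert(lim->max_sge > 0 && lim->max_sge_rd > 0);
	return lim->max_sge < lim->max_sge_rd ? lim->max_sge : lim->max_sge_rd;
}
```

With max_sge = 32 and an assumed max_sge_rd = 30 this picks 30; for a device
that reports max_sge_rd = 1 it picks 1, while the QP itself can still be
created with max_sge so that non-RDMA sends keep their 2 SGEs.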
Sagi Grimberg (2):
mlx4: Expose correct
The driver now exposes sufficient limits so we can
avoid having mlx4 specific work-around.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 13 +++--
1 files changed, 3 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniba
= 30.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/hw/mlx4/main.c |2 +-
include/linux/mlx4/device.h | 11 +++
2 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c
b/drivers/infiniband/hw/mlx4/
Detected this by compiling with W=1.
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Sagi Grimberg <sa...@mellanox.com>
FWIW,
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Did we converge on this?
Just a heads up to Doug, this conflicts with
[PATCH v4 11/16] xprtrdma: Pre-allocate Work Requests for backchannel
but it's trivial to sort out...
Hi Arnd,
Since we want to make counting semaphores go away,
Why do we want to make counting semaphores go away? completely?
or just for binary use cases?
I have a use case in iser target code where a counting semaphore is the
best suited synchronizing mechanism.
I have a single thread
Submitting a SCSI request through the SG_IO mechanism with a scatterlist
that is longer than what is supported by the SRP initiator triggers an
infinite loop. This patch series fixes that behavior.
The individual patches in this series are as follows:
0001-IB-srp-Fix-a-spelling-error.patch
Jason,
It is always acceptable to use a lkey MR instead of the local dma
lkey, but ULPs should prefer to use the local dma lkey if possible,
for performance reasons.
I don't necessarily agree with this statement (at least with the second
part of it), the world is not always perfect.
For RDMA
On 10/11/2015 15:41, Christoph Hellwig wrote:
FYI, this is the API I'd aim for (only SRP and no HW driver converted
yet):
This looks fine, although personally I find scope and direction flags
more confusing than access_flags (but maybe it's just me).
I think that the real issue here is the
On 11/11/2015 10:08, Christoph Hellwig wrote:
On Tue, Nov 10, 2015 at 11:01:56AM -0700, Jason Gunthorpe wrote:
No need to change every driver.
I'd suggest something like
unsigned int rdma_cap_rdma_read_mr_flags(const struct ib_pd *pd)
{
if (rdma_protocol_iwarp(pd->device,
I’d like to see our NFS server use the local DMA lkey where it
makes sense, to avoid the cost of registering and invalidating
memory.
I have to agree with Tom that once the device’s s/g limit is
exceeded, the server has to post an RDMA Read WR every few
pages, and appears to get expensive
On 11/11/2015 18:18, Christoph Hellwig wrote:
On Wed, Nov 11, 2015 at 08:03:46AM -0800, Bart Van Assche wrote:
Hello Christoph,
The SRP initiator from kernel 4.3 is working fine on my test setup. I will
start a test with Linus' tree and with the following SRP kernel module
parameters:
# cat
On 28/10/2015 13:28, Sagi Grimberg wrote:
This addresses a specific mlx4 issue where the max_sge_rd
is actually smaller than max_sge (rdma reads with max_sge
entries complete with error).
The second patch removes the explicit work-around from the
iser target code.
Changes from v1:
- Fixed
Hi Doug,
Kind reminder for picking this up for 4.4
Doug?
Are you planning to pick this up? Note that this patch
is stable material as well.
Doug? any plans for this patch?
Hello Hal,
With which SRP target has this behavior been observed ? Has this patch
been tested with the LIO SRP target ?
Hi Bart,
This issue was detected when testing a new array with SRP support.
This does not involve LIO as the Linux CM stack does not behave
in the way described in this
+
+struct ib_stop_cqe {
+ struct ib_cqe cqe;
+ struct completion done;
+};
+
+static void ib_stop_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+ struct ib_stop_cqe *stop =
+ container_of(wc->wr_cqe, struct ib_stop_cqe, cqe);
+
+ complete(&stop->done);
+}
+
+/*
+
On Fri, Nov 13, 2015 at 3:46 PM, Christoph Hellwig wrote:
The new name is irq_poll as iopoll is already taken. Better suggestions
welcome.
Sagi (or Christoph if you can address that),
At some point over the last 18 months there was a port done at
mellanox for iser to use
On 15/11/2015 14:55, Christoph Hellwig wrote:
On Sun, Nov 15, 2015 at 11:40:02AM +0200, Sagi Grimberg wrote:
I doubt INT_MAX is useful as a budget in any use-case. It can easily
hog the CPU. If the consumer is given access to poll a CQ, it must be
able to provide some way to budget it. Why
On 15/11/2015 11:04, Or Gerlitz wrote:
On Sun, Nov 15, 2015 at 10:48 AM, Sagi Grimberg
<sa...@dev.mellanox.co.il> wrote:
Or is correct,
I have attempted to convert iser to use blk_iopoll in the past, however
I've seen inconsistent performance and latency skews (comparing to
tasklet
We should really get this properly mapped/unmapped per IO at some point.
Probably do it in both code paths...
Having said that,
Looks fine,
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
From: Jenny Derzhavetz <jen...@mellanox.com>
When all the task data is sent as immediate data, we are
allowed to use the local_dma_lkey as it is not sent to
the wire.
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
From: Roi Dayan <r...@mellanox.com>
destroy workqueue on transport register error
release kmem cache on workqueue alloc error
Signed-off-by: Roi Dayan <r...@mellanox.com>
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/ulp/iser/iscsi_iser.c | 9 ++-
sponse) completion.
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/ulp/iser/iscsi_iser.h | 3 +-
drivers/infiniband/ulp/iser/iser_initiator.c | 55 +++-
drivers/infiniband/ulp
We don't need iser_proto.h anymore, remove it and
move (non-protocol) declarations to ib_isert.h
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
---
drivers/infiniband/ulp/isert/ib_isert.c| 1 -
drivers/infiniband/ulp/iser
From: Jenny Derzhavetz <jen...@mellanox.com>
iser target does not support zero based virtual addresses and
send with invalidate, so it should declare that it doesn't.
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
With remote invalidate we won't local invalidate
but we still want to increment the rkey.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
---
drivers/infiniband/ulp/iser/iser_memory.c | 20 ++--
1 file changed, 1
ise.
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/ulp/iser/iser_memory.c | 8
drivers/infiniband/ulp/iser/iser_verbs.c | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a
The iser RDMA_CM negotiation protocol is shared by
the initiator and the target, so have a shared header
for the defines and structure. Move relevant items from
the initiator and target headers.
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Jenny Derzhavetz <jen...@mel
sponse.
Signed-off-by: Jenny Derzhavetz <jen...@mellanox.com>
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 39 +++--
drivers/infiniband/ulp/isert/ib_isert.h | 2 ++
2 files changed, 34 insertions(+), 7
Remove the unused ib_alloc_mw and ib_bind_mw functions, remove the
unused IB_WR_BIND_MW and IB_WC_BIND_MW opcodes and move ib_dealloc_mw
into the uverbs module.
Signed-off-by: Christoph Hellwig
Will the user-space drivers posting via uverbs (qib, hfi, rxe) need the
post_send
On 16/11/2015 19:02, Christoph Hellwig wrote:
On Mon, Nov 16, 2015 at 07:00:06PM +0200, Sagi Grimberg wrote:
Remove the unused ib_alloc_mw and ib_bind_mw functions, remove the
unused IB_WR_BIND_MW and IB_WC_BIND_MW opcodes and move ib_dealloc_mw
into the uverbs module.
Signed-off
After looking at the nes driver, I don't see any common way to support drain
w/o some serious driver mods. Since SRP is the only
user, perhaps we can ignore iWARP for this function...
But iser/isert essentially does it too (and I think xprtrdma will have
it soon)...
the modify_qp is
On 15/11/2015 23:10, Or Gerlitz wrote:
On Sun, Nov 15, 2015, Sagi Grimberg <sa...@dev.mellanox.co.il> wrote:
On 15/11/2015 19:59, Christoph Hellwig wrote:
Without this sg_dma_len will return 0 on architectures that have
the dma_length field.
and what's wrong with that?
Becaus
On 10/11/2015 14:28, Sagi Grimberg wrote:
Hi Yann,
Why were those hw providers not modified to
enforce IB_ACCESS_REMOTE_WRITE when needed, instead of asking users to
set it for them ?
Do you mean that ULPs will set IB_ACCESS_LOCAL_WRITE and
iWARP providers executing the memory
Sagi, the Windows NDKPI has an NDK_MR_FLAG_RDMA_READ_SINK attribute
which the upper layer can use to convey this information, I've mentioned
it here before.
https://msdn.microsoft.com/en-us/library/windows/hardware/hh439908(v=vs.85).aspx
Thanks for the tip Tom.
When this approach is used,
Why? the invalidate is just one part of the story, we are doing a
mapping on IO submission
and CX3 has strong ordering on FRWRs, right?
Yes, this is correct.
We'll test on CX3 to see if this introduces a regression.
We should make sure not to introduce performance regression for HW which
On 10/11/2015 13:38, Christoph Hellwig wrote:
On Tue, Nov 10, 2015 at 12:44:14PM +0200, Sagi Grimberg wrote:
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -238,7 +238,7 @@ int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
read = min_t
Instead of hard-coding remote access (which is not secured
issue in IB).
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
b/net/
On 10/11/2015 13:41, Christoph Hellwig wrote:
Oh, and while we're at it. Can someone explain why we're even
using rdma_read_chunk_frmr for IB? It seems to work around the
fact that iWarp only allows a single RDMA READ SGE, but it's used
whenever the device has IB_DEVICE_MEM_MGT_EXTENSIONS,
Looks reasonable, although currently this code is only used for iWarp
anyway.
I know... I'm hoping this will change at some point, and when it does,
it will get it right hopefully.
From all I can tell nes also is an iWarp driver.
It is.. I don't know why I treated it as IB :)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
attributes merge into
struct ib_device.
Sagi Grimberg (3):
IB/core: Expose a device attribute for rdma_read access flags
svcrdma: Use device rdma_read_access_flags
RDS_IW: Use device rdma_read_access_flags
drivers/infiniband/hw/cxgb3/iwch_provider.c | 2 ++
drivers/infiniband/hw/cxgb4
Signed-off-by: Sagi Grimberg <sa...@mellanox.com>
---
drivers/infiniband/hw/cxgb3/iwch_provider.c | 2 ++
drivers/infiniband/hw/cxgb4/provider.c | 2 ++
drivers/infiniband/hw/mlx4/main.c| 1 +
drivers/infiniband/hw/mlx5/main.c| 1 +
drivers/infiniband/hw
FYI, I've updated the git branch to be based on current linus' tree
which required a few bits to be fixed. I'd also like to note that while
everyone but Or seemed to be generally fine with it I'd really prefer
an actual Reviewed-by or Acked-by tag.
You can add:
Tested-by: Sagi Grimberg
Hi Yann,
Why were those hw providers not modified to
enforce IB_ACCESS_REMOTE_WRITE when needed, instead of asking users to
set it for them ?
Do you mean that ULPs will set IB_ACCESS_LOCAL_WRITE and
iWARP providers executing the memory registration will add
IB_ACCESS_REMOTE_WRITE? That's
which must support FRs to comply
+* to the iWarp verbs spec. iWarp devices also support the
+* IB_WR_RDMA_READ_WITH_INV verb for RDMA READs that invalidate the
+* stag.
+*/
Kinda weird that READ_WITH_INV came in without a device cap for it.
Looks good,
Reviewe
Looks good,
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Looks good,
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Looks good,
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
On 03/11/2015 20:56, Bart Van Assche wrote:
On 11/03/2015 09:44 AM, Sagi Grimberg wrote:
Can you spare a few words on this change in the change log?
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: Sebastian Parschauer &l
On 15/10/2015 12:26, Sagi Grimberg wrote:
When using work request based memory registration (fast_reg)
we must reserve SQ entries for registration and invalidation
in addition to send operations. Each IO consumes 3 SQ entries
(registration, send, invalidation) so we need to allocate 3x
larger
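The sizing rule quoted in the snippet above (three send-queue entries per IO
when registration is done with work requests) can be sketched as follows.
This is only an illustration of that arithmetic; sq_entries_needed and its
parameters are hypothetical names, not code from the patch, and the
non-fast_reg case is an assumption that each IO needs only its send entry.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* registration + send + invalidation, as described in the snippet above */
#define SQ_ENTRIES_PER_FASTREG_IO 3u

/*
 * Hypothetical helper: send-queue depth needed for max_ios outstanding IOs.
 * Without work-request based registration, each IO is assumed to consume
 * only its send entry.
 */
static unsigned int sq_entries_needed(unsigned int max_ios, bool fast_reg)
{
	if (!fast_reg)
		return max_ios;

	/* Guard against unsigned overflow before multiplying. */
	assert(max_ios <= UINT_MAX / SQ_ENTRIES_PER_FASTREG_IO);
	return max_ios * SQ_ENTRIES_PER_FASTREG_IO;
}
```

For example, 128 outstanding IOs with fast_reg would need 384 SQ entries
under this rule, against 128 without it.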