On 7/20/2015 12:43 AM, Or Gerlitz wrote:
On Sun, Jul 19, 2015 at 7:07 PM, Sagi Grimberg sa...@dev.mellanox.co.il wrote:
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
I agree it would definitely help as the lack of immediate data
emphasizes the additional latency of doing rdma reads.
Sagi,
Hi Doug-
Are you comfortable ack’ing this one? If so, I can carry it in my
nfs-rdma-for-4.3 series following the patch that removes the last
ib_reg_phys_mr() call site.
On Jul 14, 2015, at 4:11 PM, Chuck Lever chuck.le...@oracle.com wrote:
The verbs are obsolete. The ib_rereg_phys_mr() verb
RDMA_NOMSG type calls are less efficient than RDMA_MSG. Count NOMSG
calls so administrators can tell if they happen to be used more than
expected.
Signed-off-by: Chuck Lever chuck.le...@oracle.com
Tested-by: Devesh Sharma devesh.sha...@avagotech.com
---
net/sunrpc/xprtrdma/rpc_rdma.c |1 +
Currently xprtrdma appends an extra chunk element to the RPC/RDMA
read chunk list of each NFSv4 WRITE compound. The extra element
contains the final GETATTR operation in the compound.
The result is an extra RDMA READ operation to transfer a very short
piece of each NFS WRITE compound (typically
Currently Linux always offers a reply chunk, even when the reply
can be sent inline (i.e., is smaller than 1KB).
On the client, registering a memory region can be expensive. A
server may choose not to use the reply chunk, wasting the cost of
the registration.
This is a change only for RPC replies
The verbs are obsolete. The ib_rereg_phys_mr() verb is not used by
kernel ULPs, and the last ib_reg_phys_mr() call site in the kernel
tree has now been removed.
Two staging tree call sites remain in the Lustre client. The Lustre
team has been notified of the deprecation of reg_phys_mr.
When accounting the needed_pages, we need to look into
the page_list->max_page_list_len and not the global
context xprt->sc_frmr_pg_list_len.
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
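The accounting issue described in this patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the svcrdma code: the function name and all of its parameters are hypothetical, and the clamp stands in for "look at the limit of the page list actually being filled" rather than a transport-wide value.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: how many pages one registration may cover.
 * max_page_list_len is the capacity of the specific page list in use,
 * which may differ from a limit recorded globally on the transport.
 */
static size_t pages_for_this_registration(size_t page_offset, size_t length,
                                          size_t page_size,
                                          size_t max_page_list_len)
{
    size_t needed;

    assert(page_size > 0);
    assert(page_offset < page_size);

    /* Round up: a partial trailing page still needs a whole entry. */
    needed = (page_offset + length + page_size - 1) / page_size;

    /* Never ask for more entries than this page list can hold. */
    return needed < max_page_list_len ? needed : max_page_list_len;
}
```

Passing the per-list capacity as an argument keeps the caller from accidentally consulting a different, global limit.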
On Jul 20, 2015, at 12:54 PM, Sagi Grimberg sa...@mellanox.com wrote:
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey. This breaks
rpcrdma drivers.
Query and set this lkey when creating the device resources.
Wow. This
In particular, recognize when an IPv6 connection is bound.
Signed-off-by: Chuck Lever chuck.le...@oracle.com
Tested-by: Devesh Sharma devesh.sha...@avagotech.com
---
net/sunrpc/xprtrdma/transport.c | 28 +++-
1 file changed, 11 insertions(+), 17 deletions(-)
diff --git
Several important client-side performance and scalability
improvements are made in this series, proposed for the 4.3
kernel, including:
- Increase maximum RPC/RDMA credits to 128
- Increase maximum NFS/RDMA r/wsize to one megabyte
- Prefer inline rather than reply chunk replies
And these fixes:
The point of larger rsize and wsize is to reduce the per-byte cost
of memory registration and deregistration. Modern HCAs can typically
handle a megabyte or more with a single registration operation.
Signed-off-by: Chuck Lever chuck.le...@oracle.com
Reviewed-by: Devesh Sharma
Repair how rpcrdma_marshal_req() chooses which RDMA message type
to use for large non-WRITE operations so that it picks RDMA_NOMSG
in the correct situations, and sets up the marshaling logic to
SEND only the RPC/RDMA header.
Large NFSv2 SYMLINK requests now use RDMA_NOMSG calls. The Linux NFS
checkpatch.pl complained about the seq_printf() format string split
across lines and the use of %Lu.
Signed-off-by: Chuck Lever chuck.le...@oracle.com
Tested-by: Devesh Sharma devesh.sha...@avagotech.com
---
net/sunrpc/xprtrdma/transport.c | 48 +++
1 file
On Jul 20, 2015, at 2:31 PM, Chuck Lever chuck.le...@oracle.com wrote:
Hi Doug-
Are you comfortable ack’ing this one? If so, I can carry it in my
nfs-rdma-for-4.3 series following the patch that removes the last
ib_reg_phys_mr() call site.
Yes.
Acked-by: Doug Ledford
Untangle the end of rpcrdma_ia_open() by moving DMA MR set-up, which
is different for each registration method, to the .ro_open functions.
This is refactoring only. No behavior change is expected.
Signed-off-by: Chuck Lever chuck.le...@oracle.com
Tested-by: Devesh Sharma
When the size of the RPC message is near the inline threshold (1KB),
the client would allow messages to be sent that were a few bytes too
large.
When marshaling RPC/RDMA requests, ensure the combined size of
RPC/RDMA header and RPC header do not exceed the inline threshold.
Endpoints typically
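The size check described in this message can be sketched in plain C. This is an illustrative userspace model, not the xprtrdma marshaling code: fits_inline(), its parameters, and the INLINE_THRESHOLD constant are hypothetical names, and the 1KB value is simply the figure quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed for illustration: the 1KB inline threshold quoted above. */
#define INLINE_THRESHOLD 1024u

/*
 * Return true when the RPC/RDMA header and the RPC header together fit
 * within the inline threshold. Counting only one of the two would let
 * through messages that are a few bytes too large.
 */
static bool fits_inline(size_t rdma_hdr_len, size_t rpc_hdr_len,
                        size_t threshold)
{
    assert(threshold > 0);

    /* Written as a subtraction so the sum cannot overflow. */
    if (rdma_hdr_len > threshold)
        return false;
    return rpc_hdr_len <= threshold - rdma_hdr_len;
}
```

With a 28-byte RPC/RDMA header, for instance, a 996-byte RPC header still fits while a 1000-byte one does not.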
RDMA_MSGP type calls insert a zero pad in the middle of the RPC
message to align the RPC request's data payload to the server's
alignment preferences. A server can then page flip the payload
into place to avoid a data copy in certain circumstances. However:
1. The client has to have a priori
All HCA providers have an ib_get_dma_mr() verb. Thus
rpcrdma_ia_open() will either grab the device's local_dma_key if one
is available, or it will call ib_get_dma_mr() which is a 100%
guaranteed fallback.
There is never any need to use the ib_reg_phys_mr() code path in
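The two-way choice described in this message (use the device's local DMA lkey if it has one, otherwise fall back to a DMA MR) can be modeled as follows. This is a hedged userspace sketch: struct model_device and choose_lkey() are hypothetical stand-ins invented for the example and do not reproduce the kernel's ib_device or rpcrdma_ia_open().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state consulted at open time. */
struct model_device {
    bool     has_local_dma_lkey; /* models IB_DEVICE_LOCAL_DMA_LKEY */
    uint32_t local_dma_lkey;     /* meaningful only when the flag is set */
    uint32_t dma_mr_lkey;        /* lkey a DMA MR would supply (models ib_get_dma_mr()) */
};

/* Prefer the device-wide lkey; otherwise use the DMA MR's lkey. */
static uint32_t choose_lkey(const struct model_device *dev)
{
    assert(dev != NULL);

    if (dev->has_local_dma_lkey)
        return dev->local_dma_lkey;
    return dev->dma_mr_lkey;
}
```

Note that the model trusts the capability flag, so a device that sets the flag without filling in the lkey would hand back a bogus value.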
PHYSICAL memory registration uses a single rkey for all of the
client's memory, thus is insecure. It is still useful in some cases
for testing.
Retain the ability to select PHYSICAL memory registration capability
via /proc/sys/sunrpc/rdma_memreg_strategy, but don't fall back to it
if the HCA does
The client has been setting up a reply chunk for NFS READs that are
smaller than the inline threshold. This is not efficient: both the
server and client CPUs have to copy the reply's data payload into
and out of the memory region that is then transferred via RDMA.
Using the write list, the data
In preparation for similar increases on NFS/RDMA servers, bump the
advertised credit limit for RPC/RDMA to 128. This allocates some
extra resources, but the client will continue to allow only the
number of RPCs in flight that the server requests via its advertised
credit limit.
Signed-off-by:
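The relationship between the client's maximum and the server's advertised credit limit can be sketched like this. It is an assumption-laden model rather than the xprtrdma implementation: rpcs_allowed_in_flight() and CLIENT_MAX_CREDITS are hypothetical names, and treating the client maximum as a hard cap is an assumption made for the example.

```c
#include <assert.h>

/* The new client-side maximum named in the message above. */
#define CLIENT_MAX_CREDITS 128u

/*
 * The client sizes its resources for client_max requests, but only
 * keeps as many RPCs in flight as the server has granted.
 */
static unsigned int rpcs_allowed_in_flight(unsigned int server_credits,
                                           unsigned int client_max)
{
    assert(client_max > 0);

    return server_credits < client_max ? server_credits : client_max;
}
```

A server that still grants 32 credits therefore sees no change in concurrency; the larger limit only matters once the server advertises more.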
On Jul 17, 2015, at 5:07 PM, Luis R. Rodriguez mcg...@do-not-panic.com
wrote:
From: Luis R. Rodriguez mcg...@suse.com
WARN() may confuse users, fix that. ipath_init_one() is part of the
device's probe so this would only be triggered if a corresponding
device was found.
Signed-off-by:
On Sun, Jul 19, 2015 at 08:33:24AM +0300, Sagi Grimberg wrote:
I was thinking that the user won't explicitly say which key it registers
and it will be decided from the registration itself.
Meaning, the registration code will do:
Please don't..
if (access | (IB_ACCESS_REMOTE_READ |
On 7/20/2015 8:13 PM, Chuck Lever wrote:
On Jul 20, 2015, at 1:00 PM, Sagi Grimberg sa...@mellanox.com wrote:
When accounting the needed_pages, we need to look into
the page_list->max_page_list_len and not the global
context xprt->sc_frmr_pg_list_len.
Signed-off-by: Sagi Grimberg
-Original Message-
From: linux-nfs-ow...@vger.kernel.org
[mailto:linux-nfs-ow...@vger.kernel.org] On Behalf Of Sagi Grimberg
Sent: Monday, July 20, 2015 12:00 PM
To: linux-...@vger.kernel.org; linux-rdma@vger.kernel.org
Cc: Chuck Lever; Steve Wise
Subject: [PATCH RFC] svcrdma:
On Mon, Jul 20, 2015 at 07:27:52PM +0300, Sagi Grimberg wrote:
I'm thinking now that this should have an input argument
of block_size. Maybe in the future ULPs would want to register
huge pages, it will be a shame to map it into PAGE_SIZE chunks...
Why wouldn't it just transparently support
On Sun, Jul 19, 2015 at 08:45:26AM +0300, Sagi Grimberg wrote:
/**
* ib_mr_set_sg() - populate memory region buffers
* array from a SG list
* @mr: memory region
* @sg: sg list
* @sg_nents: number of elements in the sg
*
* Can fail if the HW
On 7/20/2015 7:23 PM, Jason Gunthorpe wrote:
On Sun, Jul 19, 2015 at 08:33:24AM +0300, Sagi Grimberg wrote:
I was thinking that the user won't explicitly say which key it registers
and it will be decided from the registration itself.
Meaning, the registration code will do:
Please don't..
if
On 7/20/2015 8:00 PM, Jason Gunthorpe wrote:
On Mon, Jul 20, 2015 at 07:27:52PM +0300, Sagi Grimberg wrote:
I'm thinking now that this should have an input argument
of block_size. Maybe in the future ULPs would want to register
huge pages, it will be a shame to map it into PAGE_SIZE chunks...
On Jul 20, 2015, at 1:00 PM, Sagi Grimberg sa...@mellanox.com wrote:
When accounting the needed_pages, we need to look into
the page_list->max_page_list_len and not the global
context xprt->sc_frmr_pg_list_len.
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
On 7/20/2015 12:44 PM, Sagi Grimberg wrote:
On 7/20/2015 12:43 AM, Or Gerlitz wrote:
On Sun, Jul 19, 2015 at 7:07 PM, Sagi Grimberg
sa...@dev.mellanox.co.il wrote:
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
I agree it would definitely help as the lack of immediate data
emphasizes the
I'm thinking now that this should have an input argument
of block_size. Maybe in the future ULPs would want to register
huge pages, it will be a shame to map it into PAGE_SIZE chunks...
Why wouldn't it just transparently support huge pages? sg seems to
have enough information.
I'm not sure I
On Thu, Jul 16, 2015 at 09:25:37AM +0300, Or Gerlitz wrote:
On 7/14/2015 11:28 PM, Alex Thorlton wrote:
We see the same exact messages on 4.1-rc8.
Does this solve the problem?
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index ad31e47..c8ae3b9 100644
---
Hi Alexey,
You are right: the flash code starts with 0[xX], not with xX, so the
p[0] checks on the marked line are useless. This is left over from early
development and was never changed.
Thanks
Faisal
-Original Message-
From: Alexey Dobriyan [mailto:adobri...@gmail.com]
Sent: Friday, July
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey. This breaks
rpcrdma drivers.
Query and set this lkey when creating the device resources.
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
drivers/infiniband/hw/mlx5/main.c
On 7/20/2015 8:08 PM, Chuck Lever wrote:
On Jul 20, 2015, at 12:54 PM, Sagi Grimberg sa...@mellanox.com wrote:
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey. This breaks
rpcrdma drivers.
Query and set this lkey when creating
-Original Message-
From: Steve Wise [mailto:sw...@opengridcomputing.com]
Sent: Monday, July 20, 2015 4:34 PM
To: 'Jason Gunthorpe'; 'Tom Talpey'
Cc: 'Chuck Lever'; 'linux-rdma@vger.kernel.org'; 'linux-...@vger.kernel.org'
Subject: RE: [PATCH v3 05/15] xprtrdma: Remove last
On 7/20/2015 1:55 PM, Chuck Lever wrote:
On Jul 20, 2015, at 4:34 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 12:03 PM, Chuck Lever wrote:
All HCA providers have an ib_get_dma_mr() verb. Thus
rpcrdma_ia_open() will either grab the device's local_dma_key if one
is available, or it will
On Jul 20, 2015, at 5:55 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 1:55 PM, Chuck Lever wrote:
On Jul 20, 2015, at 4:34 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 12:03 PM, Chuck Lever wrote:
All HCA providers have an ib_get_dma_mr() verb. Thus
rpcrdma_ia_open() will
Hi Tom-
On Jul 20, 2015, at 4:34 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 12:03 PM, Chuck Lever wrote:
All HCA providers have an ib_get_dma_mr() verb. Thus
rpcrdma_ia_open() will either grab the device's local_dma_key if one
is available, or it will call ib_get_dma_mr() which is a
-Original Message-
From: linux-nfs-ow...@vger.kernel.org
[mailto:linux-nfs-ow...@vger.kernel.org] On Behalf Of Jason Gunthorpe
Sent: Monday, July 20, 2015 4:06 PM
To: Tom Talpey; Steve Wise
Cc: Chuck Lever; linux-rdma@vger.kernel.org; linux-...@vger.kernel.org
Subject: Re: [PATCH
On 7/20/2015 2:16 PM, Steve Wise wrote:
-Original Message-
From: linux-nfs-ow...@vger.kernel.org [mailto:linux-nfs-ow...@vger.kernel.org]
On Behalf Of Jason Gunthorpe
Sent: Monday, July 20, 2015 4:06 PM
To: Tom Talpey; Steve Wise
Cc: Chuck Lever; linux-rdma@vger.kernel.org;
On Mon, Jul 20, 2015 at 01:34:16PM -0700, Tom Talpey wrote:
On 7/20/2015 12:03 PM, Chuck Lever wrote:
All HCA providers have an ib_get_dma_mr() verb. Thus
rpcrdma_ia_open() will either grab the device's local_dma_key if one
is available, or it will call ib_get_dma_mr() which is a 100%
Based on that, should we remove the cxgb3 driver as well? Or at least
can you fix it up to at least fail get_dma_mr if there is too much
ram?
I would like to keep cxgb3 around. I can add code to fail if the memory is
32b. Do you know how I get the amount of available
ram?
On Mon, Jul 20, 2015 at 04:37:15PM -0500, Steve Wise wrote:
From: Steve Wise [mailto:sw...@opengridcomputing.com]
Sent: Monday, July 20, 2015 4:34 PM
To: 'Jason Gunthorpe'; 'Tom Talpey'
Cc: 'Chuck Lever'; 'linux-rdma@vger.kernel.org'; 'linux-...@vger.kernel.org'
Subject: RE: [PATCH
On Mon, Jul 20, 2015 at 03:04:21PM -0700, Tom Talpey wrote:
B) why bother to check? Are machines with 4GB interesting, and worth
supporting a special optimization?
mainline drivers shouldn't silently malfunction.
Jason
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
On Mon, Jul 20, 2015 at 08:07:00PM +0300, Sagi Grimberg wrote:
On 7/20/2015 8:00 PM, Jason Gunthorpe wrote:
On Mon, Jul 20, 2015 at 07:27:52PM +0300, Sagi Grimberg wrote:
I'm thinking now that this should have an input argument
of block_size. Maybe in the future ULPs would want to register
On Mon, Jul 20, 2015 at 08:09:57PM +0300, Sagi Grimberg wrote:
On 7/20/2015 8:08 PM, Chuck Lever wrote:
On Jul 20, 2015, at 12:54 PM, Sagi Grimberg sa...@mellanox.com wrote:
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey.
On 7/20/2015 12:03 PM, Chuck Lever wrote:
All HCA providers have an ib_get_dma_mr() verb. Thus
rpcrdma_ia_open() will either grab the device's local_dma_key if one
is available, or it will call ib_get_dma_mr() which is a 100%
guaranteed fallback.
I recall that in the past, some providers did
On 7/20/2015 3:17 PM, Jason Gunthorpe wrote:
On Mon, Jul 20, 2015 at 03:04:21PM -0700, Tom Talpey wrote:
B) why bother to check? Are machines with 4GB interesting, and worth
supporting a special optimization?
mainline drivers shouldn't silently malfunction.
I meant why bother to check, just
-Original Message-
From: Tom Talpey [mailto:t...@talpey.com]
Sent: Monday, July 20, 2015 5:04 PM
To: Steve Wise; 'Jason Gunthorpe'
Cc: 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-...@vger.kernel.org
Subject: Re: [PATCH v3 05/15] xprtrdma: Remove last ib_reg_phys_mr() call site
-Original Message-
From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com]
Sent: Monday, July 20, 2015 5:54 PM
To: Steve Wise
Cc: 'Tom Talpey'; 'Chuck Lever'; linux-rdma@vger.kernel.org;
linux-...@vger.kernel.org
Subject: Re: [PATCH v3 05/15] xprtrdma: Remove last
On Mon, Jul 20, 2015 at 05:43:34PM -0500, Steve Wise wrote:
Yah, that seems much better.. With the patch set I am working on this
will mean all ULPs will fail to create a kernel PD on cxgb3 if the
above triggers. If our error handling works that should just make it
unusable from kernel
On Jul 20, 2015, at 6:41 PM, Jason Gunthorpe jguntho...@obsidianresearch.com
wrote:
On Mon, Jul 20, 2015 at 06:31:11PM -0400, Chuck Lever wrote:
On Jul 20, 2015, at 6:26 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
On Mon, Jul 20, 2015 at 03:03:11PM -0400, Chuck Lever
On 07/19/2015 02:25 PM, Vasiliy Tolstov wrote:
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
it is easy to add to the SRP initiator and target drivers.
Implementations exist in the ib_srp-backport initiator driver and the
SCST SRP target driver (see also
On Jul 20, 2015, at 8:11 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 4:36 PM, Chuck Lever wrote:
On Jul 20, 2015, at 6:41 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
On Mon, Jul 20, 2015 at 06:31:11PM -0400, Chuck Lever wrote:
On Jul 20, 2015, at 6:26 PM, Jason
On 7/20/2015 5:34 PM, Chuck Lever wrote:
On Jul 20, 2015, at 8:11 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 4:36 PM, Chuck Lever wrote:
On Jul 20, 2015, at 6:41 PM, Jason Gunthorpe jguntho...@obsidianresearch.com
wrote:
On Mon, Jul 20, 2015 at 06:31:11PM -0400, Chuck Lever
On Mon, Jul 20, 2015 at 03:03:11PM -0400, Chuck Lever wrote:
+ iov->length = size;
+ iov->lkey = ia->ri_have_dma_lkey ?
+ ia->ri_dma_lkey : ia->ri_bind_mem->lkey;
+ rb->rg_size = size;
+ rb->rg_owner = NULL;
return rb;
There is something odd looking
On Jul 20, 2015, at 6:26 PM, Jason Gunthorpe jguntho...@obsidianresearch.com
wrote:
On Mon, Jul 20, 2015 at 03:03:11PM -0400, Chuck Lever wrote:
+iov->length = size;
+iov->lkey = ia->ri_have_dma_lkey ?
+ia->ri_dma_lkey : ia->ri_bind_mem->lkey;
+rb->rg_size
-Original Message-
From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com]
Sent: Monday, July 20, 2015 5:14 PM
To: Steve Wise
Cc: 'Tom Talpey'; 'Chuck Lever'; linux-rdma@vger.kernel.org;
linux-...@vger.kernel.org
Subject: Re: [PATCH v3 05/15] xprtrdma: Remove last
On Mon, Jul 20, 2015 at 06:31:11PM -0400, Chuck Lever wrote:
On Jul 20, 2015, at 6:26 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
On Mon, Jul 20, 2015 at 03:03:11PM -0400, Chuck Lever wrote:
+ iov->length = size;
+ iov->lkey = ia->ri_have_dma_lkey ?
+
On Mon, Jul 20, 2015 at 05:41:27PM -0500, Steve Wise wrote:
B) why bother to check? Are machines with 4GB interesting, and worth
supporting a special optimization?
No, but cxgb3 is still interesting to user applications, and perhaps NFSRDMA
using FRMRs.
Doesn't look like the NFS client
On 7/20/2015 3:21 PM, Chuck Lever wrote:
On Jul 20, 2015, at 5:55 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 1:55 PM, Chuck Lever wrote:
On Jul 20, 2015, at 4:34 PM, Tom Talpey t...@talpey.com wrote:
On 7/20/2015 12:03 PM, Chuck Lever wrote:
All HCA providers have an
On 7/20/2015 4:36 PM, Chuck Lever wrote:
On Jul 20, 2015, at 6:41 PM, Jason Gunthorpe jguntho...@obsidianresearch.com
wrote:
On Mon, Jul 20, 2015 at 06:31:11PM -0400, Chuck Lever wrote:
On Jul 20, 2015, at 6:26 PM, Jason Gunthorpe jguntho...@obsidianresearch.com
wrote:
On Mon, Jul 20,
On 7/20/2015 3:41 PM, Steve Wise wrote:
-Original Message-
From: Tom Talpey [mailto:t...@talpey.com]
Sent: Monday, July 20, 2015 5:04 PM
To: Steve Wise; 'Jason Gunthorpe'
Cc: 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-...@vger.kernel.org
Subject: Re: [PATCH v3 05/15] xprtrdma:
On 07/19/2015 09:07 AM, Sagi Grimberg wrote:
On 7/16/2015 6:25 PM, Bart Van Assche wrote:
As you probably know for write requests immediate data means sending
the data in the same packet as the write command instead of sending it
as a separate packet. This approach improves performance and
On Mon, Jul 20, 2015 at 11:28:03AM -0500, Alex Thorlton wrote:
I've got some time on the large machine later today. I'll give this a
try then.
I ran a boot with this patch applied:
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 83e80ab..c84aea0 100644
---
On Jul 16, 2015, at 3:52 AM, Devesh Sharma devesh.sha...@avagotech.com
wrote:
We have received appropriate permissions from the code authors and
would like to resubmit the patches to change to a dual-licensed
driver.
Thank-you.
Please resubmit your patch. Include nothing but the
68 matches