[GIT PULL] pull request: linux-firmware: Add Intel OPA hfi1 firmware
Doug, Please Pull: git://github.com/weiny2/linux-firmware.git master-hfi1-firmware This is the first release of the Intel OPA hfi1 firmware required by the hfi1 driver. Thank you, Ira Weiny -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH REPOST libibverbs] Add IP and TCP/UDP TX checksum offload support
On Thu, Jun 18, 2015 at 7:38 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Sun, Jun 14, 2015 at 01:13:04PM +0300, Or Gerlitz wrote: From: Moshe Lazer mos...@mellanox.com Add a device capability flag IB_DEVICE_IP_SUM to denote checksum offload support. Devices should set this flag if they support insertion of IP, TCP and UDP checksums on outgoing IP packets sent over IB UD or ETH RAW_PACKET QPs. It would be really nice to see the man page discuss exactly what is done here, there are quite a few different options for checksum. IPoIB maps this to 'NETIF_F_IP_CSUM', so only UDP/TCP checksum in IPv4. What value should be placed in the checksum header(s) prior to issuing the send? It looks like IPoIB uses the CHECKSUM_PARTIAL protocol ? yes, we can follow here on how IPoIB uses this device capability. In that case, AFAIK and from quick look on the TX section for CHECKSUM_PARTIAL in include/linux/skbuff.h -- nothing special is assumed to be placed in the checksum field at the header/s prior to issuing the send, agree? Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V2 4/4] IB/core: RoCE GID management separate cleanup and release
On Thu, Aug 06, 2015 at 08:14:00PM +0300, Matan Barak wrote: Separate the cleanup and release IB cache functions. The cleanup function delete all GIDs and unregister the event channel, while the release function frees all memory. The cleanup function is called after all clients were removed (in the device un-registration process). I'm finding this patches ontop of patches rebase stuff to be hard to follow, but this looks like the right general idea. Hopefully Doug can check the net result carefully.. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix possible use-after-free
With which kernel version has this been observed ? This was actually observed in RHEL 7.1 kernel (I think). But given its not easy to reproduce and the same code path exists in upstream, I thought I'd send it to you for review. scsi_remove_host() waits until all outstanding requests have finished. srp_free_ch_ib() is called either before a SCSI host is registered with the SCSI core or after scsi_remove_host() has finished. So I don't see how the above call trace could be triggered with a recent kernel ? If this is the case, then I don't see any justification for having srp_destroy_qp at all... -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/core: Make ib_dealloc_pd return void
On 8/11/2015 8:57 AM, Jason Gunthorpe wrote: On Thu, Aug 06, 2015 at 08:31:58PM +0300, Sagi Grimberg wrote: This looks generally good. Would it make sense to go the extra mile here and just fixup ocrdma (the only driver that seems to be able to fail) to WARN_ON() instead of propagating an error and make it go away from the core? I've remarked to the ocrdma folks about fixing this in the past, I'm not sure where they are on that.. I was simply suggesting to move the (specific to ocrdma) WARN_ON() from the core to ocrdma. The ocrdma folks can take care of that if they feel like it. There are also a few other dealloc APIs (mr, etc) that need a similar treatment. I was planning to address the ULP side and driver side in two steps to keep things simpler. mr deallocation can only fail if there are memory windows on it. I don't know if I would bother changing it. Perhaps ULPs that don't use windows (all at the moment) will knowingly ignore it and ULPs that will use windows will be required to check it. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH for-next V7 6/6] IB/ucma: HW Device hot-removal support
From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com] It does if you want a planned 'gental' removal to be possible.. There could be a lot of design options for a 'gentle' removal, such as first sending a 'prepare' event, and only then doing the flow proposed here. I do not want to address this now. When you do a surprise removal, you disconnect the application from both ucma and uverbs device references. In this state, the only thing that the application can do is close all handles. It doesn't matter if you succeed closing a resource on a device that is still accessible, or close an already destroyed resource... Hurm. That makese sense, but that isn't entirely what this patch set does - it is rough when facing the user, but the kernel side is actually quite gental.. Ie the RDMA CM still sends GMPs on this shutdown path.. Looks like the various storage ULP shutdowns are pretty gental too. Right. Kernel removal is always gentle, because kernel ULPs are trusted to close within a finite time. As an example, I think it is important, if you want to hot-unplug a storage controller the admin expects a clean storage shutdown. I think in-kernel storage drivers do that already. IB shouldn't really be much different. If a storage stack has user-space components, then our future 'gentle' closing will take care of that. Anyway, let's get this patch-set done. Still needs to have the locking fixed.. As pointed out (https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg27023.html), this barrier scheme is used in several places including in the RDMA stack. If there is no functional issue with the code, we can discuss how to address this scheme systematically later. --Liran -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] IB/sa: Restrict SA Netlink to admin users
From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Jason Gunthorpe Sent: Tuesday, August 11, 2015 1:38 AM To: Weiny, Ira Cc: Haggai Eran; dledf...@redhat.com; linux-rdma@vger.kernel.org Subject: Re: [PATCH] IB/sa: Restrict SA Netlink to admin users On Mon, Aug 10, 2015 at 05:58:30PM -0400, ira.weiny wrote: Furthermore, the check in netlink_bind also uses the socket namespace to restrict the use of multicast. This plus my checks should allow an admin to place the SA proxy (ibacm in our test cases) in an alternate network namespace if they so desire. But this is independent to the namespace which may be used for data applications. I think Haggai is on to something, there is certainly a problem here, that netlink_bind will let a namespace subscribe is a certainly a problem for what Haggai is working on. For now, I think, only root (or CAP_ whatever) in the init namespace should have access to this feature. Not sure how to check that. netlink_capable(skb, CAP_NET_ADMIN) will do the trick. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix possible use-after-free
On 08/11/2015 07:42 AM, Sagi Grimberg wrote: [PATCH] IB/srp: Fix possible protection fault srp_destroy_qp is designed to indicate we are safe to continue with freeing the channel resources by modifying the qp error state, posting a dummy wr on the queue-pair and waiting for it to flush. This also holds for the channel registration pool as we are unmapping the memory region when handling a scsi response. Destroying the channel registration pool before we make sure we processed all the inflight IO might introduce a use-after-free of the registration pool. This use-after-free is demonstrated in the stack trace below where srp is trying to unmap a used FMR after the fmr_pool was already destroyed. Reported-by: Eliott Kespi elio...@mellanox.com Signed-off-by: Sagi Grimberg sa...@mellanox.com Please consider Cc-ing stable for this patch. Anyway, Reviewed-by: Bart Van Assche bvanass...@sandisk.com Sorry for the mixup. Does this patch make more sense? Thank you for the quick respin. By posting this second patch quickly you saved me considerable time. I was going to verify whether any upstream patches were missing from the distro kernel that was used in your tests but this second description makes it clear that scsi_remove_host() was not involved in this crash. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix possible use-after-free
With which kernel version has this been observed ? scsi_remove_host() waits until all outstanding requests have finished. srp_free_ch_ib() is called either before a SCSI host is registered with the SCSI core or after scsi_remove_host() has finished. So I don't see how the above call trace could be triggered with a recent kernel ? Bart, I think I confused in the patch I sent out. The patch I sent was designed to address a theoretical race when deleting a target during live IO. This specific use-after-free occurred in a reconnect flow where scsi_remove_host() is not invoked (assuming that dev_loss_tmo was not invoked). The below patch should address the same race in the reconnect flow: [PATCH] IB/srp: Fix possible protection fault srp_destroy_qp is designed to indicate we are safe to continue with freeing the channel resources by modifying the qp error state, posting a dummy wr on the queue-pair and waiting for it to flush. This also holds for the channel registration pool as we are unmapping the memory region when handling a scsi response. Destroying the channel registration pool before we make sure we processed all the inflight IO might introduce a use-after-free of the registration pool. This use-after-free is demonstrated in the stack trace below where srp is trying to unmap a used FMR after the fmr_pool was already destroyed. general protection fault: [#1] SMP RIP: 0010:[8151121b] [8151121b] _raw_spin_lock_irqsave+0x1b/0x50 Call Trace: [a055d88a] ib_fmr_pool_unmap+0x1a/0xb0 [ib_core] [a06c00ed] srp_unmap_data.isra.28+0x17d/0x250 [ib_srp] [a06c01eb] srp_free_req+0x2b/0x60 [ib_srp] [a06c0c94] srp_recv_completion+0x174/0x580 [ib_srp] [a04580fe] mlx4_eq_int+0x4de/0xe50 [mlx4_core] [a0458b00] mlx4_msi_x_interrupt+0x10/0x20 [mlx4_core] [810abc45] handle_irq_event_percpu+0x35/0x1b0 [810abdf2] handle_irq_event+0x32/0x50 [810ae5cf] handle_edge_irq+0x6f/0x120 [8100455a] handle_irq+0x1a/0x30 [8151b475] do_IRQ+0x45/0xb0 [8151162d] common_interrupt+0x6d/0x6d [813e4d2f] cpuidle_enter_state+0x4f/0xc0 [813e4e6c] cpuidle_idle_call+0xcc/0x210 [8100b9ea] arch_cpu_idle+0xa/0x30 [810ab1e1] cpu_startup_entry+0xe1/0x270 [81030b3a] start_secondary+0x21a/0x2c0 Reported-by: Eliott Kespi elio...@mellanox.com Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/srp/ib_srp.c | 22 +++--- 1 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 3a1514c..b220856 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -546,6 +546,17 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) if (ret) goto err_qp; + if (ch-qp) + srp_destroy_qp(ch); + if (ch-recv_cq) + ib_destroy_cq(ch-recv_cq); + if (ch-send_cq) + ib_destroy_cq(ch-send_cq); + + ch-qp = qp; + ch-recv_cq = recv_cq; + ch-send_cq = send_cq; + if (dev-use_fast_reg dev-has_fr) { fr_pool = srp_alloc_fr_pool(target); if (IS_ERR(fr_pool)) { @@ -570,17 +581,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) ch-fmr_pool = fmr_pool; } - if (ch-qp) - srp_destroy_qp(ch); - if (ch-recv_cq) - ib_destroy_cq(ch-recv_cq); - - ch-qp = qp; - ch-recv_cq = recv_cq; - ch-send_cq = send_cq; - kfree(init_attr); return 0; -- Sorry for the mixup. Does this patch make more sense? Sagi. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V7 6/6] IB/ucma: HW Device hot-removal support
On Tue, Aug 11, 2015 at 12:47:03PM +, Liran Liss wrote: Still needs to have the locking fixed.. As pointed out (https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg27023.html), this barrier scheme is used in several places including in the RDMA stack. If there is no functional issue with the code, we can discuss how to address this scheme systematically later. That's not an argument. closing has nonsensical locking, the ugly empty mutex is part of that same mess. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/sa: Restrict SA Netlink to admin users
On Mon, Aug 10, 2015 at 11:38:10PM -0600, Jason Gunthorpe wrote: On Mon, Aug 10, 2015 at 05:58:30PM -0400, ira.weiny wrote: Furthermore, the check in netlink_bind also uses the socket namespace to restrict the use of multicast. This plus my checks should allow an admin to place the SA proxy (ibacm in our test cases) in an alternate network namespace if they so desire. But this is independent to the namespace which may be used for data applications. I think Haggai is on to something, there is certainly a problem here, that netlink_bind will let a namespace subscribe is a certainly a problem for what Haggai is working on. Ok, After thinking about this more I agree. Haggai has a point about the arp tables. Like I said I'm not a namespace expert. For now, I think, only root (or CAP_ whatever) in the init namespace should have access to this feature. Not sure how to check that. For these 2 checks it is easy to change to netlink_capable instead of netlink_net_capable. Even allowing a namespace to subscribe is problematic because it will cause timeouts to hit.. Not sure what to do about that.. Ok, I look into how to deal with the netlink_bind. I _think_ this may require the RDMA netlink to provide a custom bind call. :-( Also, why the incremental patch? The original isn't ready for mainline without the message validation stuff.. Mainly because Kaike was on vacation and I was not sure what Doug would prefer. Kaike and I have discussed a couple of changes he had queued up so we will need a v9 so we will merge this into his next v9 submission. Ira -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix possible use-after-free
On 08/11/2015 07:42 AM, Sagi Grimberg wrote: diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 3a1514c..b220856 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -546,6 +546,17 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) if (ret) goto err_qp; + if (ch-qp) + srp_destroy_qp(ch); + if (ch-recv_cq) + ib_destroy_cq(ch-recv_cq); + if (ch-send_cq) + ib_destroy_cq(ch-send_cq); + + ch-qp = qp; + ch-recv_cq = recv_cq; + ch-send_cq = send_cq; + if (dev-use_fast_reg dev-has_fr) { fr_pool = srp_alloc_fr_pool(target); if (IS_ERR(fr_pool)) { @@ -570,17 +581,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) ch-fmr_pool = fmr_pool; } - if (ch-qp) - srp_destroy_qp(ch); - if (ch-recv_cq) - ib_destroy_cq(ch-recv_cq); - - ch-qp = qp; - ch-recv_cq = recv_cq; - ch-send_cq = send_cq; - kfree(init_attr); return 0; Sorry for the mixup. Does this patch make more sense? On second thought ... with your patch, if the goto err_qp branch in srp_create_ch_ib() is taken upon return ch-qp, ch-recv_cq and ch-send_cq will be dangling pointers. That will have bad consequences in the subsequent srp_free_ch_ib() call. How about replacing the above patch with the (untested) patch below ? Thanks, Bart. [PATCH] IB/srp: Avoid that a completion during reconnect causes a crash Untested. --- drivers/infiniband/ulp/srp/ib_srp.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 2968b7b..1f9ed68 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -554,9 +554,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) FR pool allocation failed (%d)\n, ret); goto err_qp; } - if (ch-fr_pool) - srp_destroy_fr_pool(ch-fr_pool); - ch-fr_pool = fr_pool; } else if (!dev-use_fast_reg dev-has_fmr) { fmr_pool = srp_alloc_fmr_pool(target); if (IS_ERR(fmr_pool)) { @@ -565,9 +562,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) FMR pool allocation failed (%d)\n, ret); goto err_qp; } - if (ch-fmr_pool) - ib_destroy_fmr_pool(ch-fmr_pool); - ch-fmr_pool = fmr_pool; } if (ch-qp) @@ -577,6 +571,16 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch) if (ch-send_cq) ib_destroy_cq(ch-send_cq); + if (dev-use_fast_reg dev-has_fr) { + if (ch-fr_pool) + srp_destroy_fr_pool(ch-fr_pool); + ch-fr_pool = fr_pool; + } else if (!dev-use_fast_reg dev-has_fmr) { + if (ch-fmr_pool) + ib_destroy_fmr_pool(ch-fmr_pool); + ch-fmr_pool = fmr_pool; + } + ch-qp = qp; ch-recv_cq = recv_cq; ch-send_cq = send_cq; -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/sa: Restrict SA Netlink to admin users
On Tue, Aug 11, 2015 at 05:27:17AM -0600, Wan, Kaike wrote: From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Jason Gunthorpe Sent: Tuesday, August 11, 2015 1:38 AM To: Weiny, Ira Cc: Haggai Eran; dledf...@redhat.com; linux-rdma@vger.kernel.org Subject: Re: [PATCH] IB/sa: Restrict SA Netlink to admin users On Mon, Aug 10, 2015 at 05:58:30PM -0400, ira.weiny wrote: Furthermore, the check in netlink_bind also uses the socket namespace to restrict the use of multicast. This plus my checks should allow an admin to place the SA proxy (ibacm in our test cases) in an alternate network namespace if they so desire. But this is independent to the namespace which may be used for data applications. I think Haggai is on to something, there is certainly a problem here, that netlink_bind will let a namespace subscribe is a certainly a problem for what Haggai is working on. For now, I think, only root (or CAP_ whatever) in the init namespace should have access to this feature. Not sure how to check that. netlink_capable(skb, CAP_NET_ADMIN) will do the trick. For these calls yes. For the bind call I think we need to investigate more. Ira -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/12] IB: Replace safe uses for ib_get_dma_mr with pd-local_dma_lkey
On 08/01/2015 01:05 PM, Doug Ledford wrote: On 07/31/2015 07:14 PM, Jason Gunthorpe wrote: On Fri, Jul 31, 2015 at 04:04:35PM -0700, Bart Van Assche wrote: With only patches 1/12 and 8/12 applied my test passes. Thanks Bart! If you like, I can try and work on #9 (no promises :(), but I think I'll need some way to test it here. Do you by chance have a straightforward recipe to setup SRP and SRPT on two Linux's for this simple purpose? Doug, just drop #9 'IB/srp: Do not create an all physical insecure rkey by default' please, thanks. I've pulled in the series for 4.3. I've included patch #9, but that's on the understanding that if it can't be fixed before the initial pull request, I'll drop it at that time. I have the resources to test this, so I can work on it, and that factors into my path forward here. Hi Doug, Do you feel comfortable enough to queue the nine patches that replace Jason's patch #9 for kernel v4.3 (see also http://thread.gmane.org/gmane.linux.drivers.rdma/28153/focus=28419) or would you rather prefer to take out patch #9 and queue these nine patches for kernel v4.4 ? Please advise. Thanks, Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] pull request: linux-firmware: Add Intel OPA hfi1 firmware
On Tue, Aug 11, 2015 at 05:32:06PM -0400, ira.weiny wrote: Doug, Please Pull: git://github.com/weiny2/linux-firmware.git master-hfi1-firmware This is the first release of the Intel OPA hfi1 firmware required by the hfi1 driver. Um, I'm not a lawyer, but that license does not look OK for the firmware repo.. At least I've never seen a redistributable license with stuff like this: (1)reproduce and execute the Software only for internal use with Intel Products, including designing products for Intel Products,; this license does not include the right to sublicense, and may be exercised only within Your facilities by Your employees; You should talk internally and ask why this needs to be different from other Intel firmware blob licenses like i915 .. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] pull request: linux-firmware: Add Intel OPA hfi1 firmware
On Tue, Aug 11, 2015 at 10:47:03PM +, Vogel, Steve wrote: The license terms allow anyone to distribute (but not sell) the firmware but only for use on Intel products. Redistribution alone may be enough to be included in linux-firmware However, most of the additional terms (and there are lots of them) this imposes beyond the usual likely make it impossible to include in a distro, so pragmatically, there is no reason to push for inclusion in linux-firmware. This is going to be a hard road for you guys. Falling in line with every other Intel firmware blob's (i915, ibt, iwlwifi, SST2) license would be much easier on you and the distros. Frankly, I think the onus is on you to get statements from the licensing teams at Fedora, Debian, RH and SuSE on if they can include work under this license or not. I suspect Fedora and Debian will both say they can't, just based on their public policies and the additional restrictions in this license.. But hey, I'm not a licensing lawyer.. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html