Re: [PATCH rfc 5/6] block: Add rdma affinity based queue mapping helper
> shouldn't you include and like in commit 8ec2ef2b66ea2f that fixes
> blk-mq-pci.c ?

Not really. We can lose these from blk-mq-pci.c as well.

> +#include
> +#include
> +#include
> +#include
> +#include "blk-mq.h"
>
> Is this include needed ?

You're right, I can just keep:

+#include
+#include
+#include
Re: [PATCH rfc 5/6] block: Add rdma affinity based queue mapping helper
On 04/02/2017 07:41 AM, Sagi Grimberg wrote:
> Like pci and virtio, we add a rdma helper for affinity
> spreading. This achieves optimal mq affinity assignments
> according to the underlying rdma device affinity maps.

Reviewed-by: Jens Axboe

--
Jens Axboe
Re: [PATCH rfc 5/6] block: Add rdma affinity based queue mapping helper
On Tue, Apr 04, 2017 at 10:46:54AM +0300, Max Gurtovoy wrote:
>> +	if (set->nr_hw_queues > dev->num_comp_vectors)
>> +		goto fallback;
>> +
>> +	for (queue = 0; queue < set->nr_hw_queues; queue++) {
>> +		mask = ib_get_vector_affinity(dev, first_vec + queue);
>> +		if (!mask)
>> +			goto fallback;
>
> Christoph,
> we can use fallback also in the blk-mq-pci.c in case pci_irq_get_affinity
> fails, right ?

For PCI it shouldn't fail as the driver calling pci_irq_get_affinity
knows how it set up the interrupts. So I don't think it's necessary there.
Re: [PATCH rfc 5/6] block: Add rdma affinity based queue mapping helper
diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
new file mode 100644
index ..d402f7c93528
--- /dev/null
+++ b/block/blk-mq-rdma.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2017 Sagi Grimberg.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */

shouldn't you include and like in commit 8ec2ef2b66ea2f that fixes
blk-mq-pci.c ?

+#include
+#include
+#include
+#include
+#include "blk-mq.h"

Is this include needed ?

+
+/**
+ * blk_mq_rdma_map_queues - provide a default queue mapping for rdma device
+ * @set:	tagset to provide the mapping for
+ * @dev:	rdma device associated with @set.
+ * @first_vec:	first interrupt vector to use for queues (usually 0)
+ *
+ * This function assumes the rdma device @dev has at least as many available
+ * interrupt vectors as @set has queues.  It will then query its affinity mask
+ * and build a queue mapping that maps a queue to the CPUs that have irq affinity
+ * for the corresponding vector.
+ *
+ * In case either the driver passed a @dev with fewer vectors than
+ * @set->nr_hw_queues, or @dev does not provide an affinity mask for a
+ * vector, we fall back to the naive mapping.
+ */
+int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
+		struct ib_device *dev, int first_vec)
+{
+	const struct cpumask *mask;
+	unsigned int queue, cpu;
+
+	if (set->nr_hw_queues > dev->num_comp_vectors)
+		goto fallback;
+
+	for (queue = 0; queue < set->nr_hw_queues; queue++) {
+		mask = ib_get_vector_affinity(dev, first_vec + queue);
+		if (!mask)
+			goto fallback;

Christoph,
we can use fallback also in the blk-mq-pci.c in case pci_irq_get_affinity
fails, right ?

+
+		for_each_cpu(cpu, mask)
+			set->mq_map[cpu] = queue;
+	}
+
+	return 0;
+fallback:
+	return blk_mq_map_queues(set);
+}
+EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues);

Otherwise, Looks good.

Reviewed-by: Max Gurtovoy
Re: [PATCH rfc 5/6] block: Add rdma affinity based queue mapping helper
On Sun, Apr 02, 2017 at 04:41:31PM +0300, Sagi Grimberg wrote:
> Like pci and virtio, we add a rdma helper for affinity
> spreading. This achieves optimal mq affinity assignments
> according to the underlying rdma device affinity maps.
>
> Signed-off-by: Sagi Grimberg
> ---
>  block/Kconfig               |  5
>  block/Makefile              |  1 +
>  block/blk-mq-rdma.c         | 56 +
>  include/linux/blk-mq-rdma.h | 10
>  4 files changed, 72 insertions(+)
>  create mode 100644 block/blk-mq-rdma.c
>  create mode 100644 include/linux/blk-mq-rdma.h
>
> diff --git a/block/Kconfig b/block/Kconfig
> index 89cd28f8d051..3ab42bbb06d5 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -206,4 +206,9 @@ config BLK_MQ_VIRTIO
>  	depends on BLOCK && VIRTIO
>  	default y
>
> +config BLK_MQ_RDMA
> +	bool
> +	depends on BLOCK && INFINIBAND
> +	default y
> +
>  source block/Kconfig.iosched
> diff --git a/block/Makefile b/block/Makefile
> index 081bb680789b..4498603dbc83 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -26,6 +26,7 @@ obj-$(CONFIG_BLK_CMDLINE_PARSER)	+= cmdline-parser.o
>  obj-$(CONFIG_BLK_DEV_INTEGRITY)	+= bio-integrity.o blk-integrity.o t10-pi.o
>  obj-$(CONFIG_BLK_MQ_PCI)	+= blk-mq-pci.o
>  obj-$(CONFIG_BLK_MQ_VIRTIO)	+= blk-mq-virtio.o
> +obj-$(CONFIG_BLK_MQ_RDMA)	+= blk-mq-rdma.o
>  obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
>  obj-$(CONFIG_BLK_WBT)		+= blk-wbt.o
>  obj-$(CONFIG_BLK_DEBUG_FS)	+= blk-mq-debugfs.o
> diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
> new file mode 100644
> index ..d402f7c93528
> --- /dev/null
> +++ b/block/blk-mq-rdma.c
> @@ -0,0 +1,56 @@
> +/*
> + * Copyright (c) 2017 Sagi Grimberg.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +#include
> +#include
> +#include
> +#include
> +#include "blk-mq.h"
> +
> +/**
> + * blk_mq_rdma_map_queues - provide a default queue mapping for rdma device
> + * @set:	tagset to provide the mapping for
> + * @dev:	rdma device associated with @set.
> + * @first_vec:	first interrupt vector to use for queues (usually 0)
> + *
> + * This function assumes the rdma device @dev has at least as many available
> + * interrupt vectors as @set has queues.  It will then query its affinity mask
> + * and build a queue mapping that maps a queue to the CPUs that have irq affinity
> + * for the corresponding vector.
> + *
> + * In case either the driver passed a @dev with fewer vectors than
> + * @set->nr_hw_queues, or @dev does not provide an affinity mask for a
> + * vector, we fall back to the naive mapping.
> + */
> +int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
> +		struct ib_device *dev, int first_vec)
> +{
> +	const struct cpumask *mask;
> +	unsigned int queue, cpu;
> +
> +	if (set->nr_hw_queues > dev->num_comp_vectors)
> +		goto fallback;

maybe print a warning here?

Otherwise looks fine:

Reviewed-by: Christoph Hellwig
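To make the "print a warning" suggestion concrete, here is a rough userspace sketch of a warn-once check on the vector count. It is not the patch's code: toy_enough_vectors() and toy_fallback_warnings are hypothetical names made up for illustration, and fprintf() stands in for whatever the kernel side would use (presumably something along the lines of pr_warn_once(), but that is an assumption, nothing in this thread settles it).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counter, only here so the behaviour can be observed. */
static unsigned int toy_fallback_warnings;

/*
 * Returns true when every hardware queue can get its own completion
 * vector.  Otherwise warns the first time it happens and returns false,
 * which the caller would treat as "use the naive mapping".
 */
static bool toy_enough_vectors(unsigned int nr_hw_queues,
			       unsigned int num_comp_vectors)
{
	static bool warned;

	if (nr_hw_queues <= num_comp_vectors)
		return true;

	if (!warned) {
		warned = true;
		toy_fallback_warnings++;
		fprintf(stderr,
			"blk-mq-rdma: %u queues but only %u vectors, using naive mapping\n",
			nr_hw_queues, num_comp_vectors);
	}
	return false;
}
```

Warning only once keeps the log quiet if the mapping is redone many times (for example on every reconnect), at the cost of hiding later occurrences.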
[PATCH rfc 5/6] block: Add rdma affinity based queue mapping helper
Like pci and virtio, we add a rdma helper for affinity
spreading. This achieves optimal mq affinity assignments
according to the underlying rdma device affinity maps.

Signed-off-by: Sagi Grimberg
---
 block/Kconfig               |  5
 block/Makefile              |  1 +
 block/blk-mq-rdma.c         | 56 +
 include/linux/blk-mq-rdma.h | 10
 4 files changed, 72 insertions(+)
 create mode 100644 block/blk-mq-rdma.c
 create mode 100644 include/linux/blk-mq-rdma.h

diff --git a/block/Kconfig b/block/Kconfig
index 89cd28f8d051..3ab42bbb06d5 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -206,4 +206,9 @@ config BLK_MQ_VIRTIO
 	depends on BLOCK && VIRTIO
 	default y
 
+config BLK_MQ_RDMA
+	bool
+	depends on BLOCK && INFINIBAND
+	default y
+
 source block/Kconfig.iosched
diff --git a/block/Makefile b/block/Makefile
index 081bb680789b..4498603dbc83 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -26,6 +26,7 @@ obj-$(CONFIG_BLK_CMDLINE_PARSER)	+= cmdline-parser.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY)	+= bio-integrity.o blk-integrity.o t10-pi.o
 obj-$(CONFIG_BLK_MQ_PCI)	+= blk-mq-pci.o
 obj-$(CONFIG_BLK_MQ_VIRTIO)	+= blk-mq-virtio.o
+obj-$(CONFIG_BLK_MQ_RDMA)	+= blk-mq-rdma.o
 obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
 obj-$(CONFIG_BLK_WBT)		+= blk-wbt.o
 obj-$(CONFIG_BLK_DEBUG_FS)	+= blk-mq-debugfs.o
diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
new file mode 100644
index ..d402f7c93528
--- /dev/null
+++ b/block/blk-mq-rdma.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2017 Sagi Grimberg.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#include
+#include
+#include
+#include
+#include "blk-mq.h"
+
+/**
+ * blk_mq_rdma_map_queues - provide a default queue mapping for rdma device
+ * @set:	tagset to provide the mapping for
+ * @dev:	rdma device associated with @set.
+ * @first_vec:	first interrupt vector to use for queues (usually 0)
+ *
+ * This function assumes the rdma device @dev has at least as many available
+ * interrupt vectors as @set has queues.  It will then query its affinity mask
+ * and build a queue mapping that maps a queue to the CPUs that have irq affinity
+ * for the corresponding vector.
+ *
+ * In case either the driver passed a @dev with fewer vectors than
+ * @set->nr_hw_queues, or @dev does not provide an affinity mask for a
+ * vector, we fall back to the naive mapping.
+ */
+int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
+		struct ib_device *dev, int first_vec)
+{
+	const struct cpumask *mask;
+	unsigned int queue, cpu;
+
+	if (set->nr_hw_queues > dev->num_comp_vectors)
+		goto fallback;
+
+	for (queue = 0; queue < set->nr_hw_queues; queue++) {
+		mask = ib_get_vector_affinity(dev, first_vec + queue);
+		if (!mask)
+			goto fallback;
+
+		for_each_cpu(cpu, mask)
+			set->mq_map[cpu] = queue;
+	}
+
+	return 0;
+fallback:
+	return blk_mq_map_queues(set);
+}
+EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues);
diff --git a/include/linux/blk-mq-rdma.h b/include/linux/blk-mq-rdma.h
new file mode 100644
index ..b4ade198007d
--- /dev/null
+++ b/include/linux/blk-mq-rdma.h
@@ -0,0 +1,10 @@
+#ifndef _LINUX_BLK_MQ_RDMA_H
+#define _LINUX_BLK_MQ_RDMA_H
+
+struct blk_mq_tag_set;
+struct ib_device;
+
+int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
+		struct ib_device *dev, int first_vec);
+
+#endif /* _LINUX_BLK_MQ_RDMA_H */
--
2.7.4
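For anyone who wants to poke at the mapping logic outside the kernel, below is a minimal userspace model of what blk_mq_rdma_map_queues() does. Everything prefixed toy_ is hypothetical and exists only for this sketch: CPU masks are plain bitmasks, TOY_NR_CPUS is fixed at 8, a zero mask means "no affinity information", and an out-of-range vector is assumed to yield no mask. The naive fallback here is a simple cpu % nr_queues spread, which is a simplification and not what blk_mq_map_queues() really does.

```c
#include <assert.h>

#define TOY_NR_CPUS 8	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for the rdma device's affinity information. */
struct toy_dev {
	unsigned int num_comp_vectors;
	const unsigned long *vector_affinity;	/* CPU bitmask per vector */
};

/* Returns the CPU bitmask of a vector, or 0 when none is available. */
static unsigned long toy_get_vector_affinity(const struct toy_dev *dev,
					     unsigned int vec)
{
	if (vec >= dev->num_comp_vectors)
		return 0;
	return dev->vector_affinity[vec];
}

/* Simplified naive mapping: spread CPUs round-robin over the queues. */
static int toy_map_queues(unsigned int *mq_map, unsigned int nr_hw_queues)
{
	unsigned int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		mq_map[cpu] = cpu % nr_hw_queues;
	return 0;
}

/* Same control flow as the patch: affinity-based mapping with fallback. */
static int toy_rdma_map_queues(unsigned int *mq_map,
			       unsigned int nr_hw_queues,
			       const struct toy_dev *dev,
			       unsigned int first_vec)
{
	unsigned int queue, cpu;
	unsigned long mask;

	assert(nr_hw_queues > 0);

	if (nr_hw_queues > dev->num_comp_vectors)
		goto fallback;

	for (queue = 0; queue < nr_hw_queues; queue++) {
		mask = toy_get_vector_affinity(dev, first_vec + queue);
		if (!mask)
			goto fallback;

		/* Every CPU with irq affinity for this vector uses this queue. */
		for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
			if (mask & (1UL << cpu))
				mq_map[cpu] = queue;
	}

	return 0;
fallback:
	return toy_map_queues(mq_map, nr_hw_queues);
}
```

With two vectors whose masks are 0x0f and 0xf0, two queues end up with CPUs 0-3 on queue 0 and CPUs 4-7 on queue 1. Asking for four queues on the same device takes the fallback path. Note that, as in the patch, a fallback taken partway through the loop simply overwrites the entries already written.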