On Tue, Sep 29, 2015 at 10:47 PM, Jens Axboe <ax...@kernel.dk> wrote:
> On 09/29/2015 08:26 AM, Keith Busch wrote:
>>
>> On Mon, 28 Sep 2015, Ming Lei wrote:
>>>
>>> This patchset introduces .map_changed callback into 'struct blk_mq_ops',
>>> and use this callback to get NVMe notified about the mapping changed
>>> event, then NVMe can update the irq affinity hint for its queues.
>>
>> I think this is going the wrong direction. Shouldn't we provide blk-mq
>> the vectors in the tag set so that layer can manage the irq hints?
>>
>> This could lead to more cpu-queue assignment optimizations from using
>> that information. For example, two h/w contexts sharing the same vector
>> shouldn't be assigned to cpus on different NUMA nodes.
>
> I agree, this is moving in the wrong direction. Currently the sw <->hw queue
> mappings are in blk-mq, and this is the exact same information base we need
> for IRQ affinity handling. We need to move in the direction of having blk-mq
> helpers handle that part too, not pass notifications to the lower level
> driver to update its IRQ mappings.

Yes, I thought of that before, but it has the following cons:

- Some drivers/devices may need a different IRQ affinity policy, such as
  virtio devices, which have their own set-affinity handler (see
  virtqueue_set_affinity()), and it is often not efficient to handle a
  virtqueue's irq on more than one CPU.

- The block core has to get the irq vector information, and that
  information has to be set up and finalized before blk-mq uses it for
  setting irq affinity. For example, in the case of NVMe's admin queue,
  the vector can change after the admin queue's initialization.

That is why I said this approach is more flexible.

>
>>> Also the 'cpumask' in 'struct blk_mq_tags' isn't needed any more, so
>>> remove that and related kernel interface.
>>
>> It was added to the tags because the cpu mask is an artifact of the
>> tags rather than duplicating it across all the h/w contexts sharing the
>> same set. It also doesn't let a h/w context from one namespace overwrite
>> another's cpu affinity mask when they share the same vector.
>
> So having the mask in the tags is really odd, it should be in some
> per-device type data instead.

Agreed, removing the mask from the tags is one of this patchset's
motivations.

--
Ming Lei