On Tue, Feb 09, 2021 at 06:38:43AM -0600, Mike Christie wrote:
> Doing a work per cmd can lead to lots of threads being created.
> This patch just replaces the completion work per cmd with a per cpu
> list. Combined with the first patches this allows tcm loop on top of
> initiators like iser to go from around 700K IOPs to 1000K and reduces
> the number of threads that get created when the system is under heavy
> load and hitting the initiator driver's tagging limits.

OTOH it does increase completion latency, and lower latency might be the
preference for some workloads.  Do we need a tunable here?

> +static void target_queue_cmd_work(struct se_cmd_queue *q, struct se_cmd *se_cmd,
> +                               int cpu, struct workqueue_struct *wq)
>  {
> -     struct se_cmd *cmd = container_of(work, struct se_cmd, work);
> +     llist_add(&se_cmd->se_cmd_list, &q->cmd_list);
> +     queue_work_on(cpu, wq, &q->work);
> +}

Do we need this helper at all?  Having it open coded in the two callers
would seem easier to follow to me.
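
To make the tradeoff concrete, here is a rough userspace sketch of the
pattern, not the actual target code.  It assumes C11 atomics as a stand-in
for the kernel llist API, and struct cmd, struct cmd_queue, submit() and
drain_queue() are hypothetical names invented for illustration.  submit()
is the open-coded two-line form, and drain_queue() shows why a command may
wait for a whole batch before it completes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's llist_node / llist_head. */
struct node {
	struct node *next;
};

struct list_head {
	_Atomic(struct node *) first;
};

/* Hypothetical command and per-CPU queue, loosely modelled on the patch. */
struct cmd {
	struct node link;	/* must stay the first member, see drain_queue() */
	int tag;
};

struct cmd_queue {
	struct list_head cmd_list;
	int work_pending;	/* stand-in for the work item's pending state */
};

/* Lock-free push onto the head of the list. */
static void list_push(struct node *n, struct list_head *h)
{
	struct node *first = atomic_load(&h->first);

	assert(n && h);
	do {
		n->next = first;	/* first is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));
}

/* Detach the whole list in a single atomic exchange. */
static struct node *list_take_all(struct list_head *h)
{
	return atomic_exchange(&h->first, NULL);
}

/* Open-coded form of the helper: add to the list, then kick the work. */
static void submit(struct cmd_queue *q, struct cmd *c)
{
	list_push(&c->link, &q->cmd_list);
	q->work_pending = 1;	/* stand-in for queue_work_on(cpu, wq, &q->work) */
}

/*
 * Work function stand-in: one run completes every command queued so far.
 * Commands come back newest first, so a real implementation that cares
 * about ordering would have to reverse the list before walking it.
 */
static int drain_queue(struct cmd_queue *q)
{
	struct node *n;
	int completed = 0;

	q->work_pending = 0;
	n = list_take_all(&q->cmd_list);
	while (n) {
		struct cmd *c = (struct cmd *)n;	/* link is the first member */

		n = n->next;
		(void)c->tag;	/* a real handler would complete the command here */
		completed++;
	}
	return completed;
}
```

Three submit() calls followed by a single drain_queue() complete all three
commands in one pass.  That is where the thread savings come from, and also
where the extra latency comes from: the first command queued is not
completed until the worker gets around to the whole batch.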
