I've been struggling with the fridge threads.  For NFS/RDMA,
once we are running a work thread, we really don't want to
hand off to another work thread during processing in
thr_decode_rpc_request() -- the handoff adds considerable latency.

Panasas also reports latency and stalls in VFS::PanFS.

Currently, the fridge starts processing some work, then stops
and re-queues a work request nfs_rpc_enqueue_req(nfsreq) at
the end of thr_decode_rpc_request().

According to the comments in nfs_rpc_dispatcher_thread.c:

  * Next, the preferred dispatch thread should be, I speculate, one
  * which has (most) recently handled a request for this xprt.

("I" isn't identified.)

So there is extra complication in nfs_rpc_getreq_ng()
  * calling fridgethr_submit(req_fridge, thr_decode_rpc_requests, xprt),
  ** which in turn runs thr_decode_rpc_requests()
  *** to loop on any multiple requests per xprt,
  **** handing each to a separate worker.

All to attempt "locality of reference" for a worker.

This is particularly bad for RDMA, as serializing multiple
requests this way means the maximum number of buffers has to
be held outstanding at a time!

Perhaps a simpler and more efficient design would borrow from
Internet Routing: Weighted Fair Queuing.

We could more easily insert jobs into a weighted array of queues,
and then the thread keeps going without any handoff until done
(or until it has to wait for another event).

If, after completing a req, another req for the same xprt is
found, that next req should be moved to the end of the weighted
queue, so that other xprts aren't treated unfairly.

Those are the two basic elements of Weighted Fair Queuing (WFQ).
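
To make the idea concrete, here is a rough sketch of those two
elements. It is not ganesha code: every name in it (struct wfq,
wfq_enqueue(), wfq_next(), the integer xprt_id standing in for an
xprt) is hypothetical, locking is omitted, and the scheduler is a
simple weighted round-robin approximation rather than true WFQ with
virtual finish times. A worker would call wfq_next() itself after
finishing a req, instead of handing off to another thread.

```c
#include <assert.h>
#include <stddef.h>

#define WFQ_NQUEUES 2	/* hypothetical number of weight classes */

struct wfq_req {
	int xprt_id;		/* stand-in for the owning xprt */
	int req_id;
	struct wfq_req *next;
};

struct wfq_queue {
	struct wfq_req *head;
	struct wfq_req *tail;
	unsigned int weight;	/* reqs served per visit to this queue */
};

struct wfq {
	struct wfq_queue q[WFQ_NQUEUES];
	unsigned int cur;	/* queue currently being served */
	unsigned int credit;	/* reqs left for cur in this visit */
};

static void wfq_init(struct wfq *w, const unsigned int weights[WFQ_NQUEUES])
{
	for (unsigned int i = 0; i < WFQ_NQUEUES; i++) {
		w->q[i].head = w->q[i].tail = NULL;
		w->q[i].weight = weights[i] ? weights[i] : 1;
	}
	w->cur = 0;
	w->credit = w->q[0].weight;
}

/* Append a req to the tail of queue qi. */
static void wfq_enqueue(struct wfq *w, unsigned int qi, struct wfq_req *r)
{
	struct wfq_queue *q;

	assert(qi < WFQ_NQUEUES);
	q = &w->q[qi];
	r->next = NULL;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
}

/* Remove and return the head of q, or NULL if q is empty. */
static struct wfq_req *wfq_pop(struct wfq_queue *q)
{
	struct wfq_req *r = q->head;

	if (r) {
		q->head = r->next;
		if (!q->head)
			q->tail = NULL;
		r->next = NULL;
	}
	return r;
}

/*
 * Return the next req for the calling worker, or NULL if all queues
 * are empty.  last_xprt is the xprt the worker just finished a req
 * for (-1 if none).  If the head req belongs to that same xprt and
 * other reqs are waiting, it is moved to the tail once, so that
 * other xprts get a turn first.
 */
static struct wfq_req *wfq_next(struct wfq *w, int last_xprt)
{
	for (unsigned int tries = 0; tries <= WFQ_NQUEUES; tries++) {
		struct wfq_queue *q = &w->q[w->cur];

		if (q->head && w->credit > 0) {
			if (q->head->xprt_id == last_xprt &&
			    q->head != q->tail)
				wfq_enqueue(w, w->cur, wfq_pop(q));
			w->credit--;
			return wfq_pop(q);
		}
		/* Queue empty or out of credit: advance to the next one. */
		w->cur = (w->cur + 1) % WFQ_NQUEUES;
		w->credit = w->q[w->cur].weight;
	}
	return NULL;
}
```

With weights {2, 1}, three reqs on queue 0 (two for xprt 1, then one
for xprt 2) and one req on queue 1, a single worker that passes in
the xprt it just served would see the second xprt 1 req deferred
behind the xprt 2 req, and the queue 1 req served before the deferred
one comes around again.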
