On Thu, Jul 15, 2010 at 07:30:41PM +0200, Andrej Podzimek wrote:
> > I don't know why you're having problems, but for efficiency's sake
> > the amount of work being dispatched to the task queue should be at
> > least (handwaving wildly) 10-100x more time consuming than the
> > operation of dispatching itself.
>
> Well, this is exactly why I am having problems. The dispatched tasks
> are so short that dispatching can easily take (shooting in the dark)
> 20 times longer than the tasks themselves. Consequently, the task
> queue threads need to hold their bucket mutexes locked for 95% of
> their effective running time. This is not (yet) disastrous, since
> there are per-CPU buckets. However, once all buckets get full, most
> threads (and especially the task producers) start to compete for the
> global task queue mutex, which is what I observe. They spin all the
> time and it is just a matter of chance that those 100 tasks per
> second eventually get dispatched.
Task queues are useful when you have: a) a task that can take a fairly
long time to process, possibly unbounded, or that just involves high
latencies (e.g., disk I/O), and b) a thread that must produce a result
quickly.

For example, in the secure NFS stack we used to have nfsd threads
blocking on upcalls to gssd(1M). This was problematic because nfsd is
designed to have a fixed number of threads, each of which at most
blocks on a single filesystem transaction, but gssd could take longer
than that to complete an upcall. So we changed the secure RPC
implementation to queue up the part of the RPCSEC_GSS handling that
requires upcalls to gssd; the nfsd thread unwinds immediately after
queueing that task. (The first sketch below shows the shape of this
pattern.)

There are probably many task queue examples where the latency mismatch
between the caller and the task is less impressive, but there should
always be some such mismatch, otherwise there's no point to using task
queues.

> What I originally tested was a workload with big bursts (== millions)
> of small tasks that take just a few instructions in most cases, but
> some of them (perhaps one in a thousand) *might* sleep. Batching
> tasks works fine (as already mentioned), but then none of the tasks
> can sleep. So I tried the task queues, but they are obviously not
> designed for this type of workload. As you say, a task would have to
> take much more time than the dispatching overhead to keep mutex
> contention acceptable.

Can you reorganize the tasks so that you only queue the tasks that
must sleep? For example, if the sleeping comes from kmem_*alloc(),
then you could use KM_NOSLEEP instead of KM_SLEEP and queue up a task
only when the allocation fails (the queued task will use KM_SLEEP, of
course). The second sketch below shows roughly what I mean.
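To make the dispatch pattern concrete, here is a minimal sketch using
the kernel taskq API; the names (upcall_req_t, handle_upcall(), and so
on) are made up for illustration and are not the actual RPCSEC_GSS
code:

	#include <sys/types.h>
	#include <sys/kmem.h>
	#include <sys/taskq.h>

	/*
	 * Created once, e.g. at module init:
	 * upcall_tq = taskq_create("upcall_tq", 4, minclsyspri, 4, 16, 0);
	 */
	static taskq_t *upcall_tq;

	typedef struct upcall_req {
		/* ... whatever context the slow path needs ... */
		int ur_dummy;
	} upcall_req_t;

	/*
	 * Runs in a taskq thread; free to block for as long as gssd
	 * takes to complete the upcall.
	 */
	static void
	handle_upcall(void *arg)
	{
		upcall_req_t *req = arg;

		/* ... long-running upcall, context setup, etc. ... */
		kmem_free(req, sizeof (*req));
	}

	/*
	 * The fast caller (an nfsd thread, say) queues the slow part
	 * and unwinds immediately instead of blocking on the upcall.
	 */
	void
	fast_path(upcall_req_t *req)
	{
		(void) taskq_dispatch(upcall_tq, handle_upcall, req,
		    TQ_SLEEP);
	}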
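And the reorganization I'm suggesting would look roughly like this.
Again just a sketch: work_t, do_work() and BUFSZ are hypothetical, and
dispatch-failure handling is elided.

	#include <sys/kmem.h>
	#include <sys/taskq.h>

	#define	BUFSZ	128			/* hypothetical size */

	typedef struct work work_t;		/* hypothetical payload */
	extern void do_work(work_t *, void *);	/* hypothetical worker */

	static taskq_t *slow_tq;	/* for the rare sleeping cases */

	/* Taskq thread: sleeping is fine here, so KM_SLEEP is safe. */
	static void
	process_slow(void *arg)
	{
		work_t *w = arg;
		void *buf = kmem_alloc(BUFSZ, KM_SLEEP);

		do_work(w, buf);
		kmem_free(buf, BUFSZ);
	}

	/*
	 * Hot path: handles the common case inline with KM_NOSLEEP and
	 * only pays the dispatch overhead for the roughly one-in-a-
	 * thousand case that must sleep.
	 */
	void
	process(work_t *w)
	{
		void *buf = kmem_alloc(BUFSZ, KM_NOSLEEP);

		if (buf != NULL) {
			do_work(w, buf);
			kmem_free(buf, BUFSZ);
		} else {
			/*
			 * A TQ_NOSLEEP dispatch can itself fail; real
			 * code would check the return value and fall
			 * back accordingly.
			 */
			(void) taskq_dispatch(slow_tq, process_slow, w,
			    TQ_NOSLEEP);
		}
	}

Nico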