I'm looking at porting an app that maintains a work queue whose items are each processed by one of N engines, with the results written out in order. At first glance, std.parallelism already provides the queue, but the Task concept appears to assume that there is no per-thread startup cost.
Am I missing something, or do I need to roll my own shared queue object?
