Of course there is still CPS, where the overhead of a continuation is only tens of bytes, no stacks need to be allocated and no support from the OS is needed to switch contexts.
CPS has been used to build coroutines, actors, fibers and more, and easily scales to thousands or millions of concurrent flows of control. <https://github.com/nim-works/cps>