Hello evbd, afaik Martin Odersky once described a app as _a flow of typesafe transformations_. These days one might like to add _efficient multiprogram type transformations in a shared-memory environment_.
Recently i've spent some time to look at a couple of research-papers (and some lecture videos) concerned with concurrent a.k.a. _" lock-free"_ data-structures. The earliest papers were published around the year 2000 describing a concept named 'Read-Copy-Update' (RCU) which had been implemented into the Linux-kernel around ~2004 and was released as user-space-lib under the name 'Userspace-RCU' four years after that. RCU is a brainchild of Paul E. McKenney, who is currently concerned with the memory-model of the Linux-kernel. Later Hazard-Pointers (HP) were invented and coined by Maged Michael. In 2017 both concepts have been accepted by the C++-Comittee into a Technical-Specification (TS) to be worked out in the coming years. Early adoption can be seen & tested in Facebooks "Folly"-library. Another interesting 'flavour' of RCU has been published by DPDK and was presented by Nathan Brown in february at FOSDEM-2022 [ <https://fosdem.org/2022/schedule/event/comparing_dpdk_rcu_and_user_space_rcu_library/> ]. The above mentioned concepts deal with the problem of safe-memory-reclamation. Authors group these concepts in two families - epoch-based=RCU, Pointer-based=Hazard-Pointers - and hybrids of both (e.g. Hazard-Eras). I leave out (Software)-Transactional-Memory (STM) here - as long as it remains a hackers-wet-dream - instead trying to focus on the CPP-Comittees accepted concepts, that will prbl. get finalized into the C++-Standard in 2026. Looking at Nims `std/lib` and nimble-packages i wonder why there are so few attempts to introduce concurrent data-structures ? Many have expressed the wish for better concurrency support, since `threads:on` will become the default in 2.0. The lock-based List/Table-implementations from `std/shared` are nice-to-have, but prbl. not too important for future projects. Don't get me wrong here - locks will remain important in the future, but please not in concurrent data-structures that see reads&updates from many threads. Having collected a series of papers describing breakthru inventions & implementations of (b)lock-free data-structures e.g. Harris Linked-Lists 2002, Shavit/Shalev Lock-free extensible Hash-Tables (~2006) based on 'split-ordered-lists'. Those have been tested and the (amazing) gains-in-performance have been reported and confirmed in many later publication by different authors. I regard these as building-blocks for any concurrent-DS to come. So following along a quote by **G. van-Rossums** _" When you approach a new language, first look and measure the quality of the provided data-structures"_. I've done exacly this and learned that all of Nims DS are well-crafted and very-performant - esp. the `std/tables` is a performance-beast :) BUT as i type along on my Intel i5 (2014) with 4-HW-threads, i'm sure my next machine will be either a M-something, RISC-V or a modern Intel/AMD-thingy and both will provide 20-cores or more. And in that respect i'd like to see Nim 2.0 beeing able to provide data-structures that can be safely and efficiently used in a lock-free manner by multiple threads. So the pole-position on my wish-list for 2.0 is a new lib/module, filled with concurrent-DS for `Seq/Table/Set/List`, that are (as-much as possible) API-compatible with the current `std-Seq/Set/Table`-implementations. So that current projects might only need to exchange one import-line and start to benefit from better thruput or can exchange `std/shared`-types. New projects would naturally base their code on the concurrent-DS-types, without worrying about memory-reclamation-strategies. I'm sure this will provoke mentions on HN, along the lines _Look Mama 'Nim adopts lock-free DS.'_ I'd like to publish my tiny collection of 'milestone'-papers and make them available to evbd. interested. Maybe at the Nim-Wiki, or linked from my github or nice&shiny within [ <http://Markdown.com> ] or elsewhere - i'm open for suggestions. Many times i came across warnings saying : _" In mutlithreading and esp. threadsafe DS, even the most experienced hackers have very-good chances to get it wrong."_. Besides the fact-of-life _" Failure is always a option"_ yes - that might be true for the crafting of **multithreaded lock-based** DS like trees. Looking at Harris Linked-List or Shavit/Shalevs new (amazingly-brilliant) concepts i'd say these are stand-out because of their simple design. Beeing a rather mediocre hacker, i regard these as doable. The problem of 'safe-memory-reclamation' is still a hot - or at least warm - topic in research. More recent works like Interval-based-Reclamation (IBR/2018) or Variant-based-Reclamation (VBR/2021) are RCU-flavours with sophisticated additions. They seem to be very promising and my impression is, that the implementations are strictly NOT rocket-science. Again, things can be studied and tried publicly using FB 'Folly' or DPDK-RCU. ARAQ once mentioned in one of his musings that he believes either STM or Memory-Pools might be a good fit for Nim in the future - thus further taking away workloads from Nims-GC's. The VBR-paper from 2018 proposes exactly that - they test the use of _type-preserving_ Memory-Pools. Their performance results are quite good. Having that said - i don't expect that any form of language-support is needed here - rather the opposite. On the other hand, maybe some-day-in-the-future it turns out that the Nim-runtime has more oversight on a Apps usage of type-preserving-pools. Maybe one lets the VM decide or suggest upon efficient dimensions for such pools. Or in case the VM detects the use of a 'non-threadsafe-DS' during compilation and decides to abort or warn the user. The basic node-structures of e.g. Harris-Linked-Lists and the proposed HashMap/Sets by Shavit/Shalev are identical, and thus can be consumed and recycled from/to the same pool. Maybe the VM can handle/reason differently on concurrent-DS on the shared-heap ? For new users of Nim such a introduction might be a attractive feat and i find that a Version 2.0 deserves better than (retiring stuff and cleaning-up the std-lib), instead a feature that increases efficiency and will clearly go mainstream soon (within the next 2-4 yrs.) and that surprisingly 'the other crews' ( Odin/Zig/Hare/Go/V(sry, i had to)) do not provide yet - can't say if the debate it. As a side-note: The more pseudo-code i study from these papers, the more i assume one could easily provide real-nim-code that is closer-or-maybe-identical to the provided code-samples. So finally part of this is about style and ease-of-use. A good example is Nims `withLock`-template. It prevents errors (aquired but release forgotten), marks and indents the critical-section and looks good. In my RCU-wrap, you mark & indent the critical-section exactly so and understanding of what happens behind the scene on entering and leaving is not required. But just-in-case one attempts to call `synchronize_rcu` within a critical-section, that will not-compile and instead abort with a understandable error-explanation. I believe this reads well and provides for ease-of-use and safety. I considered writing a RFC on this topic, but since it _does and shall not_ demand changes to the language i decided to ask for some feedback upfront. So far i've tried : * wrapped Userspace-RCU for Nim ( *nix-only ) * implemented a HAMT, that the authors claimed to have 80-percent the thruput of a Hash-Map. Learned than that such a complex-structure remains 20-% performance - maybe its all different in Clojure-land or maybe its just me.. Still believe a HAMT can be made lock-free and threadsafe. * found Folly ( C++, all platforms ), sry, too dumb, cannot wrap that. * tried to make a wrap out of DPDK-RCU ( *nix & win) - does not compile on osx. * studied and started implementing Harris Linked-List and Shavit/Shalev Set/HashMap * collected, read and picked papers following the topic 'safe memory reclamation' * tried my best to identify new ideas&concepts that might be the right choices for Nim 2.0 So i'd love to hear your ideas, critique or warnings maybe regarding possible conflicts with other planned-features alike CSP/Weave. beatz & greetz, Andreas
