Someone asked about `nlvm` and the performance differences. In libraries like this, with CPU-heavy tight loops, the tricks that `nlvm` uses to generate more performant code for range checks and exception raising shine in particular: the `nlvm`-compiled benchmark of this library shows a 20% throughput increase for both compression and decompression, without loss of safety or functionality.
Of course, one could write messier Nim code that removes the range checks and controls loop unrolling etc. more tightly using casts, pointers and other unsafe constructs, but that defeats the purpose of writing it in Nim to begin with. The role of an optimizing compiler, after all, is to contextualise the code and make sure it performs as well as it can in the context where it's used, eliding safety mechanisms only where that is provably safe.
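To make the trade-off concrete, here is a minimal sketch of the two styles. It is not code from this library: `sumChecked` and `sumUnchecked` are hypothetical names, and a byte sum stands in for a real compression loop. The first proc uses plain indexing and leaves it to the compiler to prove the bounds checks redundant; the second bypasses them by hand through an unchecked pointer.

```nim
# Illustrative only: hypothetical procs, not part of this library.

# Safe style: ordinary indexing. Every access is range-checked unless
# the compiler can prove that `i` is always within bounds.
proc sumChecked(data: openArray[byte]): uint64 =
  for i in 0 ..< data.len:
    result += uint64(data[i])

# Hand-tuned style: the cast to `ptr UncheckedArray` turns off bounds
# checking for every access through `p`, so correctness now depends
# entirely on the loop bounds being written correctly.
proc sumUnchecked(data: openArray[byte]): uint64 =
  if data.len == 0:
    return 0  # taking the address of element 0 would be invalid here
  let p = cast[ptr UncheckedArray[byte]](unsafeAddr data[0])
  for i in 0 ..< data.len:
    result += uint64(p[i])

when isMainModule:
  let buf = [1'u8, 2, 3, 4]
  doAssert sumChecked(buf) == sumUnchecked(buf)
```

Both procs compute the same result. The difference is who carries the burden of proof: in the first the compiler does, in the second the author does, and a later edit to the loop bounds can silently turn it into an out-of-bounds read.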
