I mostly code like this now:

  data.map!(x => transform(x)).copy(output);
It's convenient and reads nicely, but it's generally inefficient. This sort of one-by-one software design is the core performance problem with OOP, and it seems a shame to be suffering OOP's failures even when there is no OOP in sight.

A central premise of performance-oriented programming, which I've employed my entire career, is "where there is one, there is probably many": if you do something to one, you should do it to many. With this in mind, the code I have always written doesn't tend to look like this:

  R manipulate(Thing thing);

Instead:

  void manipulateThings(Thing *things, size_t numThings, R *output, size_t outputLen);

(Written this way for clarity; obviously the D equivalent uses slices.)

All functions are implemented with the presumption that they will operate on many things, rather than being called many times, once for each thing. This is the single most successful design pattern I have ever encountered wrt high-performance code; ie, implement the array version first.

The problem with this API design is that it doesn't plug into algorithms or generic code well:

  data.map!(x => transformThings(&x, 1)).copy(output);

I often wonder how we can integrate this design principle conveniently (ie, seamlessly) into the design of algorithms, such that they can make use of batching functions internally and transparently.

Has anyone done any work in this area? Ideas?
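For what it's worth, one direction I've toyed with is an adapter that sits between the range-style caller and the array-first function: it walks the input in chunks and hands each chunk to the batch implementation, so the per-element call overhead is amortized without the caller seeing any difference. A minimal sketch (in C++ for illustration rather than D, and `chunked_transform` plus its parameters are names I made up for this example, not an existing API):

```cpp
#include <cstddef>
#include <algorithm>

// Hypothetical adapter: drive an array-first batch function over a flat
// input, one chunk at a time, writing results to a parallel output.
// fn must have the shape: void fn(const T* in, size_t count, R* out)
template <typename T, typename R, typename BatchFn>
void chunked_transform(const T* in, size_t n, R* out, BatchFn fn,
                       size_t chunkSize = 64)
{
    for (size_t i = 0; i < n; i += chunkSize) {
        // One batch call per chunk, instead of one call per element.
        size_t len = std::min(chunkSize, n - i);
        fn(in + i, len, out + i);
    }
}
```

The open question remains how a generic algorithm like map could do this transparently, detecting that the supplied function has a batch form and choosing a sensible chunk size, rather than the user wiring up the adapter by hand.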
