Re: [Blog post] Why and when you should use SoA

maik klein via Digitalmars-d-announce Sat, 26 Mar 2016 17:46:23 -0700

On Saturday, 26 March 2016 at 23:31:23 UTC, Alex Parrill wrote:

On Friday, 25 March 2016 at 01:07:16 UTC, maik klein wrote:
Link to the blog post: https://maikklein.github.io/post/soa-d/
Link to the reddit discussion: https://www.reddit.com/r/programming/comments/4buivf/why_and_when_you_should_use_soa/
I think structs-of-arrays are a lot more situational than you make them out to be.
You say, at the end of your article, that "SoA scales much better because you can partially access your data without needlessly loading unrelevant data into your cache". But most of the time, programs access struct fields close together in time (i.e. accessing one field of a struct usually means that you will access another field shortly). In that case, you've now split your data across multiple cache lines; not good.
Your ENetPeer example works against you here; the packetThrottle* variables would be split up into different arrays, but they will likely be checked together when throttling packets. Though admittedly, it's easy to fix; put fields likely to be accessed together in their own struct.
The SoA approach also makes random access more inefficient and makes it harder for objects to have identity. Again, your ENetPeer example works against you; it's common for servers to need to send packets to individual clients rather than broadcasting them. With the SoA approach, you end up accessing a tiny part of multiple arrays, and load several cache lines containing data for ENetPeers that you don't care about (i.e. loading irrelevant data).
I think SoA can be faster if you are commonly iterating over a section of a dataset, but I don't think that's a common occurrence. I definitely think it's unwarranted to conclude that SoAs "scale much better" without noting when they scale better, especially without benchmarks.
I will admit, though, that the template for making the struct-of-arrays is a nice demonstration of D's templates.

The next blog post that I am writing will contain a few benchmarks for SoA vs AoS.

But most of the time, programs access struct fields close together in time (i.e. accessing one field of a struct usually means that you will access another field shortly). In that case, you've now split your data across multiple cache lines; not good.

You can still group the data together if you always access it together. What you wrote is actually not true for arrays, at least not the way you wrote it.


Array!Foo arr;

Iterating over 'arr', you will always load the complete Foo struct into memory, unless you hide stuff behind pointers.
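To make that concrete, here is a minimal sketch that measures how far apart one field sits in consecutive array elements. The struct layout and the names `hot`, `cold` and `hotStride` are made up for illustration, and a built-in slice stands in for Array!Foo.

```d
import std.stdio : writeln;

// Hypothetical struct, not taken from the blog post.
struct Foo
{
    int hot;       // the only field the loop cares about
    int[15] cold;  // fields the loop never reads
}

// Distance in bytes between the `hot` field of two neighbouring elements.
size_t hotStride(Foo[] arr)
{
    return cast(size_t)(cast(ubyte*) &arr[1].hot - cast(ubyte*) &arr[0].hot);
}

void main()
{
    auto arr = new Foo[](4);
    // The stride equals Foo.sizeof, so the cold fields lie between
    // every pair of hot values and get pulled in with them.
    writeln("Foo.sizeof = ", Foo.sizeof, ", stride = ", hotStride(arr));
}
```

With a 64 byte Foo, each `hot` value is separated from the next by 60 bytes of data that the loop does not need.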

The SoA approach also makes random access more inefficient and makes it harder for objects to have identity.

No, it actually makes it much better, because you only have to load the relevant stuff into memory.


But you usually don't look at your objects in isolation.

AoS makes sense if you always care about all fields, for example Array!Vector3. You usually access all components of a vector.


What you lose is the general feel of OOP.

Vector add(Vector a, Vector b);

Array!Vector vectors;

add(vectors[index1], vectors[index2]);

This really just won't work with SoA, especially if you want to mutate the underlying data through a reference. For this you would just use AoS.
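A rough sketch of what the SoA side could look like instead. `VectorSoA` is a hand-written, hypothetical container (not the template from the blog post), and Vector is assumed to have three float components. Indexing has to gather a temporary copy and writing has to scatter it back, so there is no reference to a single Vector to hand out.

```d
struct Vector
{
    float x, y, z;
}

Vector add(Vector a, Vector b)
{
    return Vector(a.x + b.x, a.y + b.y, a.z + b.z);
}

// Hypothetical hand-written SoA container for Vector.
struct VectorSoA
{
    float[] x, y, z;

    // Gathers element i into a temporary Vector (a copy, not a reference).
    Vector opIndex(size_t i) const
    {
        return Vector(x[i], y[i], z[i]);
    }

    // Scatters a Vector back into the three arrays.
    void opIndexAssign(Vector v, size_t i)
    {
        x[i] = v.x;
        y[i] = v.y;
        z[i] = v.z;
    }
}

void main()
{
    auto vectors = VectorSoA([1f, 4f], [2f, 5f], [3f, 6f]);
    Vector sum = add(vectors[0], vectors[1]); // works, but on copies
    vectors[0] = sum;                         // explicit write-back
}
```

Reading by value still works this way, but a function that takes `ref Vector` cannot be pointed at an element of `vectors`.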

Btw, I have done a lot of benchmarks, and in the worst case SoA was always as fast as AoS.

But once you actually only access partial data, SoA can potentially be much faster.


This is what I mean by scaling.

You start with

struct Test{
  int i;
  int j;
}
Array!Test tests;

and you have absolutely no performance problem for 'tests' because it is just so small.


But after a few years Test will have grown much bigger.

struct Test{
  int i;
  int j;
  int[100] junk;
}

If you use SoA, you can always add fields without any performance penalty; that is why I said that it "scales" better.
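As a sketch of the same Test in SoA form: `TestSoA` and `sumI` are hypothetical names, and the arrays are written out by hand rather than generated by a template. A loop over `i` walks one dense int array, no matter how large `junk` becomes.

```d
// AoS element from above: every element carries the junk with it.
struct Test
{
    int i;
    int j;
    int[100] junk;
}

// Hand-written SoA equivalent (hypothetical): one array per field.
struct TestSoA
{
    int[] i;
    int[] j;
    int[100][] junk;
}

// Reads only the `i` array; `j` and `junk` are never touched.
long sumI(ref const TestSoA tests)
{
    long total = 0;
    foreach (v; tests.i)
        total += v;
    return total;
}

void main()
{
    TestSoA tests;
    tests.i = [1, 2, 3];
    tests.j = [4, 5, 6];
    tests.junk.length = 3;
    assert(sumI(tests) == 6);
}
```

In the AoS version, consecutive `i` values are Test.sizeof (408) bytes apart; in the SoA version they are int.sizeof (4) bytes apart.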

But as I have said in the blog post, you will not always replace AoS with SoA; you should replace AoS with SoA where it makes sense.

I think SoA can be faster if you are commonly iterating over a section of a dataset, but I don't think that's a common occurrence.

This happens very often in games when you use inheritance: your objects just grow really big the more functionality you add.

For example, you just want to move all objects based on velocity, so you only care about Position and Velocity. You don't have to load anything else into memory.
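A minimal sketch of that movement update, assuming a 2D position and velocity. `Vec2`, `Entities`, `integrate` and the extra `name` field are all made up for illustration.

```d
struct Vec2
{
    float x = 0, y = 0;
}

// Hypothetical SoA storage for game objects: one array per component.
struct Entities
{
    Vec2[] position;
    Vec2[] velocity;
    string[] name; // unrelated data, never read by integrate
}

// Moves every object by velocity * dt, touching only two arrays.
void integrate(ref Entities e, float dt)
{
    foreach (idx, ref p; e.position)
    {
        p.x += e.velocity[idx].x * dt;
        p.y += e.velocity[idx].y * dt;
    }
}

void main()
{
    Entities e;
    e.position = [Vec2(0, 0), Vec2(1, 1)];
    e.velocity = [Vec2(1, 2), Vec2(0, -1)];
    e.name = ["player", "crate"];
    integrate(e, 2.0f);
    assert(e.position[0] == Vec2(2, 4));
}
```

Adding more component arrays to `Entities` later does not change what `integrate` reads.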


An entity component system really is just SoA at its core.
