On Tuesday, 24 September 2013 at 17:02:18 UTC, Andrei Alexandrescu wrote:
On 9/24/13 9:58 AM, Peter Alexander wrote:
On Tuesday, 24 September 2013 at 15:25:11 UTC, Andrei Alexandrescu wrote:
What are they paying exactly? An extra arg to allocate that can
probably be defaulted?
  void[] allocate(size_t bytes, size_t align = this.alignment) shared;
For allocating relatively small objects (say up to 32K), we're
looking at tens of cycles, no more. An extra argument needs to be
passed around and more importantly looked at and acted upon. At
this level it's a serious dent in the time budget.
The cost of a few cycles really doesn't matter for memory
allocation... If you are really allocating memory so frequently
that those few extra cycles matter then you are probably going to
be memory bound anyway.
It does. I'm not even going to argue this.
Sorry, but I find this insulting. Manu and I, both professional
and senior game developers with a lot of experience in
performance, are arguing against you. I'm not saying this makes
us automatically right, but I think it's rude to dismiss our
concerns as not even worthy of discussion.
I think this is a situation where you need to justify yourself
with something concrete. Can you provide an example of some code
whose performance is significantly impacted by the addition of an
alignment parameter? It has to be "real code" that does something
useful, not just a loop that continually calls allocate.
Strings.
Strings what? Just allocating lots of small strings?
Ok, I've put together a benchmark of the simplest allocator I can
think of (pointer bump) doing *nothing* but allocating 12 bytes
at a time and copying a pre-defined string into the allocated
memory: http://dpaste.dzfl.pl/59636d82
On my machine, the difference between the version with alignment
and the version without is 1%. I tried changing the allocator to a
class so that the allocation was virtual and not inlined, and the
difference was still only ~2% (yes, I verified in the generated
code that nothing was being omitted).
In a real scenario, much more will be going on outside the
allocator, making the overhead much less than 1%.
Please let me know if you take issue with the benchmark. I wrote
this quickly so hopefully I have not made any mistakes.
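
For readers who cannot open the paste, here is a minimal sketch of the kind of benchmark described above. It is not the code at the link. The names BumpAllocator, allocate and allocateAligned, the buffer size, and the 16-byte alignment are all assumptions made for illustration. It also rounds up the offset within the buffer rather than the actual address, which assumes the buffer itself starts on a suitably aligned address.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical pointer-bump allocator over a fixed buffer.
struct BumpAllocator
{
    ubyte[] buffer; // backing storage
    size_t offset;  // index of the next free byte

    // Version without an alignment parameter.
    void[] allocate(size_t bytes)
    {
        if (bytes > buffer.length - offset)
            return null; // out of space
        auto block = buffer[offset .. offset + bytes];
        offset += bytes;
        return block;
    }

    // Version with an alignment parameter.
    // Assumption: alignment is a power of two.
    void[] allocateAligned(size_t bytes, size_t alignment)
    {
        // Round the offset up to the next multiple of alignment.
        immutable start = (offset + alignment - 1) & ~(alignment - 1);
        if (start > buffer.length || bytes > buffer.length - start)
            return null; // out of space
        auto block = buffer[start .. start + bytes];
        offset = start + bytes;
        return block;
    }
}

void main()
{
    enum count = 1_000_000;
    enum text = "hello world!";             // 12 bytes
    auto storage = new ubyte[](count * 16); // enough for the aligned case

    // Allocate 12 bytes at a time and copy the string in, no alignment.
    auto plain = BumpAllocator(storage);
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. count)
    {
        auto dst = cast(char[]) plain.allocate(text.length);
        dst[] = text[];
    }
    writefln("no alignment:   %s", sw.peek);

    // Same loop, but every allocation also handles an alignment argument.
    auto aligned = BumpAllocator(storage);
    sw.reset();
    foreach (i; 0 .. count)
    {
        auto dst = cast(char[]) aligned.allocateAligned(text.length, 16);
        dst[] = text[];
    }
    writefln("with alignment: %s", sw.peek);
}
```

The timings will vary with compiler, flags and machine, so the sketch only shows the shape of the comparison, not the 1% figure itself.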