On Tue, 15 Dec 2009 12:23:01 -0500, dsimcha <dsim...@yahoo.com> wrote:
> == Quote from Dan (dsstruth...@yahoo.com)'s article
>> My code does do considerable array appending, and I see exactly the
>> same issue as dsimcha points out above. I would expect it is
>> GC-related, but why for multiple cores only, I cannot fathom.
>> Thanks for the repro, dsimcha! My code snippet would not have been as
>> straightforward.
> Two reasons:
>
> 1. When GC info is queried, the last query is cached for (relatively)
> efficient array appending. However, this cache is not thread-local.
> Therefore, if you're appending to arrays in two different threads
> simultaneously, they'll keep evicting each other's cached GC info,
> forcing it to be looked up again and again.
>
> 2. Every array append requires a lock acquisition, which is much more
> expensive when there's contention.
>
> Bottom line: array appending in multithreaded code is **horribly**
> broken, and I'm glad it's being brought to people's attention. As a
> temporary fix, pre-allocate or use std.array.Appender.
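
For reference, the suggested workarounds look roughly like this (a
minimal sketch, not anyone's actual code; the element type and counts
are invented for illustration):

    import std.array : appender;

    void main()
    {
        // Workaround 1: pre-allocate and fill by index; the length
        // never changes, so there is no per-element lock acquisition
        // or GC-info lookup.
        auto buf = new int[](1_000_000);
        foreach (i; 0 .. buf.length)
            buf[i] = cast(int) i;

        // Workaround 2: Appender tracks its own capacity and only
        // goes back to the GC when it actually needs to grow.
        auto app = appender!(int[])();
        foreach (int i; 0 .. 1_000_000)
            app.put(i);
        int[] result = app.data;
    }

Both avoid the per-append lock and cache lookup: the first by never
growing the array, the second because Appender carries its own
capacity information instead of querying the GC on every append.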
Yes, but why do multiple cores make the problem worse? If it's the
lock, then I'd expect plain locking in multiple threads, without any
appending, to do worse on multiple cores than on a single core. If
it's the lookup, why does the lookup take longer on multiple cores?
The very idea that multiple cores make threaded code *slower* goes
against everything I've ever heard about multi-core and threads.
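
To make the scenario concrete, the pattern under discussion is roughly
the following (a sketch based on my reading of the thread, not
dsimcha's actual repro; the thread and iteration counts are invented):

    import core.thread : Thread;

    enum perThread = 1_000_000;

    void work()
    {
        int[] a;
        foreach (int i; 0 .. perThread)
            a ~= i;  // takes the global GC lock and hits the one
                     // shared GC-info cache on every single append
    }

    void main()
    {
        // Each thread appends to its own, unshared array.
        auto t1 = new Thread(&work);
        auto t2 = new Thread(&work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }

Note that nothing here shares any data between the threads, which is
what makes the multi-core slowdown so counter-intuitive.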
I agree that array appending in multithreaded code is not as efficient
as using a dedicated append-friendly object, but it's a compromise
between efficiency of appending (which arguably is not that common)
and efficiency of everything else (slicing, passing to a function,
etc.). I expect that very soon we will have efficient appending with
thread-local caching of lookups, but the multi-core thing really
puzzles me.
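
In spirit, the thread-local caching would look something like this
(purely illustrative; the names, fields, and the queryGC helper are
all invented, and this is not the actual runtime code):

    // A per-thread cache of the last GC block-info query. In D2,
    // module-level variables are thread-local by default, so no lock
    // is needed to read or update the cache, and threads can no
    // longer evict each other's cached entry.
    struct BlkInfoCache
    {
        void*  base;  // start of the last block queried (invented)
        size_t size;  // its capacity (invented)
    }

    BlkInfoCache lastQuery;  // one independent copy per thread

    size_t queryGC(void* p)
    {
        // Stand-in for the real lookup, which takes the GC lock.
        return 0;
    }

    size_t capacityOf(void* p)
    {
        if (p is lastQuery.base)
            return lastQuery.size;  // fast path: no lock, no GC call
        auto size = queryGC(p);     // slow path
        lastQuery = BlkInfoCache(p, size);
        return size;
    }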
-Steve