On Friday, 23 May 2014 at 13:43:53 UTC, Chris wrote:
On Friday, 23 May 2014 at 06:17:43 UTC, Rainer Schuetze wrote:
On 22.05.2014 21:04, Etienne wrote:
On 2014-05-22 2:12 PM, Rainer Schuetze wrote:
"NO_INTERIOR" is currently only used for the hash array used
by
associative arrays. It is a bit dangerous to use as any
pointer,slice or
register still operating on the array is ignored, so
collecting it might
corrupt your memory.
That's quite a relief, I was afraid of having to do it ;)
I'm currently exploring the possibility of sampling the
pointers during
mark'ing to check if they're gone and using bayesian
probabilities to
decide whether or not to skip the pool.
I explained it all here:
https://github.com/D-Programming-Language/druntime/pull/797#issuecomment-43896016
-- paste --
Basically, when marking, you take 1 in X of the references
and send them
to a specific array that represents the pool they refer to.
Then, next
time you're going to collect you test them individually and
if they're
mostly there, you skip marking/free'ing for that particular
pool during
collection. You can force collection on certain pools every 1
in X
collections to even out the average lifetime of the
references.
You're going to want to have a lower certainty of failure for
big
allocations, but basically you're using probabilities to
avoid pushing a
lot of useless load on the processor, especially when you're
in a part
of an application that's just allocating a lot (sampling will
determine
that the software is not in a state of data removal).
http://en.wikipedia.org/wiki/Bayes_factor
-- end paste --
The bayes factor is merely there to choose the appropriate
model that
fits with the program. Bayesian inference would take care of
deciding if
a pool should end up being mark'ed. In other words, machine
learning.
Would you think it'd be a good optimization opportunity?
Hmm, I guess I don't get the idea. You cannot skip a pool
based on some statistics, you might have references in there
to anything. As a result you cannot collect anything.
I'm not a fan of machine learning, especially not in cases you
_can_ control, like memory allocation / deallocation. Guessing
is not a good strategy, if you can have control over something.
Machine learning is only good for vast and unpredictable data
(voice recognition for example). Then it makes sense to apply
probability. But if you can actually control what you are
doing, why would you want to rely on a stupid and blind machine
that decides things for you based on a probability of n%?
Things can go wrong and you don't even know why. Mind you, we
should rule the machines, not the other way around.
Bear in mind here that most code goes though a whole bunch of
machine learning algorithms in the CPU itself. Like it or not, it
has proved extremely successful.