Steven Schveighoffer wrote: > I'll point out a couple of issues here: > > 1. you are never removing elements from the array when version(calldel) is > not enabled. That is, the GC might run, but does not clean any objects > when you just do top--. Although this might not make a huge difference, > it is not an equivalent comparison. I'd suggest adding else clauses to > the version(calldel) where you set a[top] to null.
Whoops, good catch, this is a bug. I get 1m27 when reassigning a[top] to null, which is a little bit faster. > 2. the GC is conservative, and you are allocating swaths of 1K objects. > It's quite possible that you are running into false pointer issues. This > might be helped with a more precise GC (which there is a patch for in > bugzilla). David Simcha has also checked into github improvements for > allocation of large objects. This may also help with your benchmark. The data=void can be removed, but it actually slows down the benchmark a little bit on my machine. > > I too have seen the GC perform horribly in situations like this, where you > are allocating lots and lots of data. All you need to do to prove it is > append an array for so many elements, you will see huge pauses where the > GC runs. > > These are situations where delete helps because of the conservative > implementation of the GC. Yes, it is a valid point, but it doesn't prove > that delete is better against *any* GC. What we need to do is improve the > GC > > I tried setting a[top] to null in the GC system. For comparison, on my > system your original GC test took 2m42s. > > With setting a[top] to null, the new run on my system is 2m36s. So the > time does not decrease significantly. > > Some more data: > > I added the following line to the foreach loop: > > if(!(t % 10000)) writeln(t, "\t", top); > > There is some interesting behavior, particularly for the GC and GC_hints > versions. Your code obviously goes through cycles in the array, first > growing up to 100,000, then shrinking back down to 0, then repeating the > cycle. There is some noise with the rand % 3, but it is not enough to > make the growth and shrinking truly random. What is interesting in this > display is, in both GC_hints and GC versions, the initial growth to > 100,000 is somewhat slow, with the GC_hints version being a bit faster. The growing and shrinking is intended to work exactly that way, it makes the allocations more, say, dynamic. ;) If it was truly random, the memory usage would be kept more or less constant. > > However, after that, the difference is drastic. The GC_hints version > shrinks very quickly, and the GC version continues its slow pace. This is > mildly surprising, because setting the top element to null should be much > faster than delete, and both are using the GC to allocate new nodes. > However, what I find really interesting is the subsequent growth back up > to 100,000 is still slow in the GC version, but is still very speedy in > the GC_hints version. Given that both are using the GC for allocation, I > would expect even footing. There is definitely something different going > on when delete is called than when a collection cycle is called. The GC > version may leave more objects allocated than the GC_hints version, but it > should speed up as it can collect more object blocks to reuse. It's > almost as if it's not reusing memory, but my system isn't running out of > memory. So I'm not sure exactly what's going on. > > This clearly needs some investigation, thank you for sharing your > benchmark. > > -Steve
