I decided to start a new thread for this as the original one seems to have forked multiple times.
This thread is related to https://github.com/Test-More/test-more/issues/589 alone. David and I have both put some time into reproducing this, and I want to put out a summary of my findings (I will leave it to David to summarize his, if he cares to). I was able to boil the issue down to a fairly simple test-script: https://gist.github.com/exodist/ef08a3c406d98c1bb618 which is able to reproduce the problem sometimes. XDG has a slightly different version, but in both cases the key part is this: ok(1); > my ($index) = grep { $_->{name} eq 'foo' } $coll->get_indexes; > ok(1); The first ok is used to get some state into the system, if the state is not there the problem does not occur, or occurs too infrequently to measure. Ending the test right after the first ok also never seems to reproduce the issue. The second line gets a variable generated deep inside the mongodb code, and I have been treating it like a black box. The issue never occurs if we get rid of this line, or modify it so that $index is cleared before the second ok() line. The second ok() seems to trigger the bug, to be specific we know that the scalar with the refcount mentioned in the warning is the one being created and blessed here: https://github.com/Test-More/test-more/blob/stream/master/lib/Test/Stream/Context.pm#L263 Knowing that does not help us much. This same code runs with no problems if $index is destroyed before it runs. Making any significant changed to this method, or the methods leading up to it in the stack seem to effect the frequency at which the bug occurs, but will not eliminate it completely (This initially had us following a red herring where localizing $_ seemed to remove the problem, but that ultimately turned out to be incorrect). So far we have only been able to reproduce this with any reliability on darwin, using perls 5.20.2 and 5.20.1. Alh managed to reproduce it at least once on linux, but it is not frequent enough to really dig in. 5.18.4 and earlier do not appear to reproduce the issue at all. I have not been able to try 5.20.0 or blead, and I do not expect to have the resources to do so in the near-term (I do not have a darwin box, and the one I have been using is being used for other things by its owner). There are many other things that can be done to alter the frequency, in some cases enough so to make the problem appear solved, but ultimately it seems that the more state you have built up and stored, the more likely the issue is to occur. At this point I am unable to continue further on this investigation, so I am writing up this summary for others to read and comment on, and possibly continue the investigation. -Chad