Updates:
Cc: [email protected] [email protected]
Comment #3 on issue 30177 by [email protected]: Crash in TimeTicks::Now()
http://code.google.com/p/chromium/issues/detail?id=30177
Summary: I've very concerned about the at_exit.cc code, and apparent races,
as
described below.
Note than these 3 stacks appear to be a crashes in a debug build, as
demonstrated by
the significant presence of tracked_.cc. This is visible as the tracked.cc
lines are
only enabled if the build is tracking all objects... which means it is a
debug build
(or a special build to enable this).
Hence these specific crashes will have no impact on a production
release.... but it
would be nice if it did not crash.
that said...
FWIW: The assertion is firing because Very Bad Things have happened in the
locking
system. In one case, the assertion is stating that a shadow count, which
shows how
many folks hold the locks, indicates that the lock is not held, but yet we
are about
to do a release of the lock (ut oh). In the other case, the assert is
saying that
the thread that is about to do a release is not the thread holding the lock.
In both cases the lock of interest is the lock created for use by the
singleton class
to protect access to the at_exit queue (probably the list of callbacks to
be done at
exit to destroy all the created singletons).
My guess would be that the lock instance that is being referenced has been
destroyed,
and hence there is no longer a lock instance, and the shadow count and
owning_thread
members have been corrupted (after being freed???).
The lock in question is a member of an AtExitManager instance.
I tried to study the code of AtExitManager, and there are a number of
suspicious
elements :-/. Perhaps the context of all these calls is that the system is
running
single threaded... but then I'd have no idea why we are locking... so I
have to
assume we are running multi-threaded.
a) The globally visible g_top_manager is not thread safe to access. As a
result, one
thread can access g_top_manager, and then be preempted, while another
thread can then
access it, process all its events, and destroy it :-/. The first thread
would then
assert out as it tried to Acquire the lock. Similarly, one thread can
access
g_top_manager, process all its callbacks, release the lock, and then be
pre-empted.
At that point, a second thread can access g_top_manager, acquire the lock,
and do a
push, and then be pre-empted. Finally, the first thread (not holding the
lock) can
complete the destruction of the instance (and lock), setting up for the
crash, as the
second thread can then attempt to release the lock (crash stacks as shown
above would
result).
b) The ProcessCallback method uses locks in a very questionable way. IMO,
it should
pop something from the list under a lock, and then release the lock before
executing
the callbacks. It is generally a bad idea to hold a lock for longer than
needed, and
especially bad to hold a lock while making a second unknown call, with
unknown side-
effects, and unknown lock dependencies.
c) The whole idea of snagging the global, and then looking at the lock, is a
questionable practice, as the state of the global may change before the
lock is
acquired. I'm not clear on why we need to "push" a series of these
AtExitManager,
but I suspect the activity is fraught with (race) perils. For instance,
after all
callbacks have been processed, and the lock released, and after the
ProcessCallbacks
has exited, a new process can be added to the stack, which will never be
run.
I have ideas for resolving these issues, but perhaps I'm misunderstanding
the usage
context, and wasting time. I'll wait for additional comments before
suggestion
plausible resolutions.
--
You received this message because you are listed in the owner
or CC fields of this issue, or because you starred this issue.
You may adjust your issue notification preferences at:
http://code.google.com/hosting/settings
--
Automated mail from issue updates at http://crbug.com/
Subscription options: http://groups.google.com/group/chromium-bugs