A couple of things:

1.  This only happens with pthreads (i.e., it doesn't happen with TBB).

2.  I can confirm that it is the same symptom on the Intel Phi card.

3.  I can confirm that it is alleviated by commenting out those
scoped_lock lines (for both machines).

As far as I know, pthread locks work by being 0 in their unlocked
state and nonzero when they are locked.  My only guess is that the
constructor for those mutexes hasn't been called yet to set the
initial value of the lock, so the code just sits there waiting for a
lock that never looks free.
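
To illustrate that guess, here is a minimal sketch of one common way to
rule out an uninitialized mutex.  It is not libMesh's actual code: the
names setup_mutex, Setup and early_setup are hypothetical, and it uses
C++11 std::mutex rather than our Threads wrappers.  The idea is to keep
the mutex as a function-local static, which is constructed the first
time the function is called, so it cannot be locked before it exists
no matter what order the static initializers run in.

```cpp
#include <mutex>

// Hypothetical accessor: the mutex is a function-local static, so it is
// constructed on the first call (thread-safe since C++11) instead of at
// some unspecified point during static initialization.
std::mutex & setup_mutex()
{
  static std::mutex mtx;
  return mtx;
}

// Hypothetical stand-in for a Setup-style object whose constructor
// takes a scoped lock.
struct Setup
{
  Setup()
  {
    // The mutex is guaranteed to be constructed before this lock.
    std::lock_guard<std::mutex> lock(setup_mutex());
    ++count();
  }

  // Counter kept as a function-local static for the same reason.
  static int & count()
  {
    static int n = 0;
    return n;
  }
};

// Constructed during static initialization, before main() runs.
static Setup early_setup;
```

If the hang really is an initialization-order problem, a pattern like
this should make it go away without removing the lock; if it still
hangs, the cause is something else.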

I won't have access to either of those machines to do more testing
until next week.

For now, I vote for removing those locks as they are unnecessary.

BTW - we're not creating a global LibMeshInit.  It gets created in
main() as usual.

Derek




On Aug 16, 2013, at 8:13 PM, Roy Stogner <[email protected]> wrote:

>
> On Fri, 16 Aug 2013, Derek Gaston wrote:
>
>> We're seeing hard locks on some machines in Singleton::Setup::Setup()!
>> The problem is that it's trying to create a scoped_lock using a mutex
>> that is defined in that file.  Apparently that mutex is not guaranteed
>> to have been initialized at the point where we're calling that function
>> (or something), and it just hangs while trying to acquire that lock!
>
> Hmm... remote_elem_mtx should only get constructed at static
> initialization time before main() gets called, and
> RemoteElem::create() should only get called from
> LibMeshInit::LibMeshInit() afterwards.
>
> You're not creating a global LibMeshInit object, are you?
>
>> I commented out that scoped_lock line and then the binary runs just fine.
>
> Hmmm... would you replace that global mutex with two locals?  Maybe
> there's some problem with a mutex constructor being called before we
> init TBB?
>
>> Why do we need to lock in those functions?  Surely the
>> Singleton::Setup stuff is NOT going to get called in a loop.
>
> You're right; the Setup constructor should be called at static init
> time and the setup() call should be at LibMeshInit constructor time.
>
>> How do we want to proceed?
>
> It looks like we've got redundant locks that we can safely get rid
> of... but I'd like to actually *understand* the problem too, and that
> hasn't happened for me yet.
> ---
> Roy
