Wolfgang,
> it's exactly the corner cases that are difficult to get right and for which
> the completion_mutex was meant to exist. For example, what happens if more
> than one thread call Task::join? In that case, you call wait_for_all more
> than once. The mutex avoids this.
Currently we don't have this kind of corner case anywhere in the
library: to trigger it, one needs to pass a Task to another task, which
then calls join() on it. See the attached test, based on
tests/base/task_04.cc. If I understand you correctly, such a test should
work, or what do you think? It does not work on my system, neither with
the unmodified thread_management.h nor with our modification from
yesterday. Wolfgang, how is it on your system?
This cannot be solved with the patch I had in mind: TBB does not allow
any thread other than the one that spawned the job to call wait_for_all,
so we'll need to come up with something else.
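Just to make "something else" a bit more concrete, here is a rough
sketch of one possible direction. It is not a patch and nothing in it
exists in thread_management.h or TBB: CompletionFlag and
join_from_many_threads are names I made up, and I use plain std::thread
instead of TBB tasks. The idea is that the worker sets a flag once, and
any number of threads can wait on it through a condition variable, so
nobody except the spawning thread would ever need to call wait_for_all.
I have not checked how blocking inside a TBB task like this interacts
with the TBB scheduler.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper, not part of deal.II or TBB: a one-shot
// "task has finished" flag that any number of threads may wait on.
class CompletionFlag
{
public:
  // Called once by the worker when it is done.
  void mark_done ()
  {
    {
      std::lock_guard<std::mutex> lock (state_mutex);
      done = true;
    }
    condition.notify_all ();
  }

  // May be called from any thread, any number of times.
  void wait ()
  {
    std::unique_lock<std::mutex> lock (state_mutex);
    condition.wait (lock, [this] { return done; });
  }

private:
  std::mutex              state_mutex;
  std::condition_variable condition;
  bool                    done = false;
};

// Let n_waiters threads wait for the same worker and return how many
// of them were woken up.
unsigned int join_from_many_threads (const unsigned int n_waiters)
{
  CompletionFlag            flag;
  std::atomic<unsigned int> n_woken (0);

  std::vector<std::thread> waiters;
  for (unsigned int i=0; i<n_waiters; ++i)
    waiters.emplace_back ([&] { flag.wait (); ++n_woken; });

  std::thread worker ([&] { flag.mark_done (); });

  worker.join ();
  for (auto &t : waiters)
    t.join ();

  // Waiting again after completion returns immediately.
  flag.wait ();

  assert (n_woken == n_waiters);
  return n_woken;
}
```

Calling join_from_many_threads(4) starts four waiting threads and one
worker; every waiter returns once the worker has set the flag. The mutex
here is only ever held for a short time and is always released by the
thread that locked it.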
> The way it is intended to work is of course that the worker task continues to
> run and at the end releases the mutex, which will then wake up the waiting
> tasks one-by-one. Are you suggesting that the waiting threads prevent the
> worker task from working?
To be honest, I don't really know what exactly happens. When I
run step-12, it gets stuck before any output is produced. gdb reports
that all the threads are trying to acquire a completion_mutex: thread 1
is stuck in line 810 of fe_tools.cc, and all the other threads in line
768 (where the subtasks are joined). So it seems that either the
subtasks that should release the mutex do not release it correctly, or
the waiting thread does not notice when it is released. This is only a
problem when the tasks are nested. That is all I can see in gdb; I don't
know whether one can look into the state of the actual tasks in more
detail with gdb or some other tool.
When I run valgrind's drd (the thread checker), most of the reported
errors look uncritical to me (they are inside TBB), but one of them
makes me suspicious:
==30865== Thread 3:
==30865== Recursive locking not allowed: mutex 0x16b677a0, recursion count 1, owner 3.
==30865==    at 0x4C261AF: pthread_mutex_lock (drd_pthread_intercepts.c:584)
==30865==    by 0x4A614F: __gthread_mutex_lock(pthread_mutex_t*) (gthr-default.h:758)
==30865==    by 0x4AC59F: std::mutex::lock() (mutex:88)
==30865==    by 0x4AC601: dealii::Threads::Mutex::acquire() (thread_management.h:390)
==30865==    by 0x4D139C: dealii::Threads::internal::TaskDescriptor<>::join() (thread_management.h:3981)
==30865==    by 0x4CD21F: dealii::Threads::Task<>::join() const (thread_management.h:4089)
==30865==    by 0x4C8531: dealii::Threads::TaskGroup<>::join_all() const (thread_management.h:5116)
==30865==    by 0x832318E: void dealii::FETools::(anonymous namespace)::compute_embedding_matrices_for_refinement_case<3, double, 3>(dealii::FiniteElement<3, 3> const&, std::vector<dealii::FullMatrix<double>, std::allocator<dealii::FullMatrix<double> > >&, unsigned int) (fe_tools.cc:768)
...
==30865== mutex 0x16b677a0 was first observed at:
==30865==    at 0x4C261AF: pthread_mutex_lock (drd_pthread_intercepts.c:584)
==30865==    by 0x4A614F: __gthread_mutex_lock(pthread_mutex_t*) (gthr-default.h:758)
==30865==    by 0x4AC59F: std::mutex::lock() (mutex:88)
==30865==    by 0x4AC601: dealii::Threads::Mutex::acquire() (thread_management.h:390)
==30865==    by 0x4D60D8: dealii::Threads::internal::TaskDescriptor<>::queue_task() (thread_management.h:3900)
==30865==    by 0x4D11A0: dealii::Threads::Task<>::Task(std::function<> const&) (thread_management.h:4060)
...
==30865==    by 0x8323142: void dealii::FETools::(anonymous namespace)::compute_embedding_matrices_for_refinement_case<3, double, 3>(dealii::FiniteElement<3, 3> const&, std::vector<dealii::FullMatrix<double>, std::allocator<dealii::FullMatrix<double> > >&, unsigned int) (fe_tools.cc:764)
...
This error message says that we try to recursively acquire the same
mutex (on the _same_ thread): we first acquire it in
TaskDescriptor::queue_task (line 3900 in thread_management.h), and then
try to acquire it again in TaskDescriptor::join (line 3981) when we wait
for the child to finish, in order to find out whether the child has
released it. As far as I understand things, it is system-dependent
whether a thread can acquire a lock that it already holds, and doing so
might even give unpredictable results (as I see them here). This is in
contrast to _other_ threads trying to acquire the mutex, which is the
usual case and the reason one uses mutexes in the first place.
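To show what I mean by system-dependent, here is a small sketch that is
independent of deal.II and assumes a POSIX system (the function name
relock_errorcheck_mutex is made up). With an error-checking pthread
mutex, the second lock from the same thread is refused with EDEADLK.
With a PTHREAD_MUTEX_NORMAL mutex the same call would deadlock, and for
the default mutex type POSIX leaves the behavior undefined. As far as I
know, std::mutex likewise requires that the calling thread does not
already own the mutex when it calls lock().

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: lock an error-checking mutex twice from the same
// thread and return the result of the second lock attempt.
int relock_errorcheck_mutex ()
{
  pthread_mutexattr_t attr;
  pthread_mutexattr_init (&attr);
  pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);

  pthread_mutex_t mutex;
  pthread_mutex_init (&mutex, &attr);
  pthread_mutexattr_destroy (&attr);

  // First lock: succeeds, this thread now owns the mutex.
  const int first = pthread_mutex_lock (&mutex);
  assert (first == 0);
  (void) first;

  // Second lock on the same thread: refused instead of blocking,
  // because the mutex is of the error-checking type.
  const int second = pthread_mutex_lock (&mutex);

  pthread_mutex_unlock (&mutex);
  pthread_mutex_destroy (&mutex);
  return second;
}
```

The return value is EDEADLK here only because of the
PTHREAD_MUTEX_ERRORCHECK attribute; the mutex in the trace above is not
created that way as far as I can see, so there we get whatever the
platform happens to do.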
Wolfgang, do you see a solution for this?
Best,
Martin
//-----------------------------------------------------------------------------
// $Id: task_04.cc 18850 2009-05-15 18:38:02Z bangerth $
// Version: $Name$
//
// Copyright (C) 2009 by the deal.II authors
//
// This file is subject to QPL and may not be distributed
// without copyright and license information. Please refer
// to the file deal.II/doc/license.html for the text and
// further information on this license.
//
//-----------------------------------------------------------------------------
// start tasks from tasks, and join one task from more than one place

#include <iomanip>
#include <fstream>
#include <iostream>
#include <unistd.h>

#include <base/config.h>
#include <base/job_identifier.h>
#include <base/logstream.h>
#include <base/exceptions.h>
#include <base/utilities.h>
#include <base/thread_management.h>

using namespace dealii;


// Recursively start two subtasks and wait for both of them.
void test (int i)
{
  deallog << "Task " << i << " starting..." << std::endl;

  if (i < 10)
    {
      Threads::Task<> t1 = Threads::new_task (test, 10*i+1);
      Threads::Task<> t2 = Threads::new_task (test, 10*i+2);

      t1.join ();
      t2.join ();
    }

  sleep (1);
  deallog << "Task " << i << " finished!" << std::endl;
}


// Like test(), but first join the task t0 that was started elsewhere.
void test_t (int i, Threads::Task<> &t0)
{
  deallog << "Task " << i << " starting..." << std::endl;

  if (i < 10)
    {
      t0.join();

      Threads::Task<> t1 = Threads::new_task (test, 10*i+1);
      Threads::Task<> t2 = Threads::new_task (test, 10*i+2);

      t1.join ();
      t2.join ();
    }

  sleep (1);
  deallog << "Task " << i << " finished!" << std::endl;
}


int main()
{
  deallog.threshold_double(1.e-10);

  Threads::Task<> t1 = Threads::new_task (test, 1);
  Threads::Task<> t2 = Threads::new_task (test, 2);

  // t3 receives t1 and joins it, so t1 is joined both from inside t3
  // and from main() below.
  Threads::Task<> t3 = Threads::new_task (test_t, 3, t1);

  t1.join ();
  std::cout << "Finished task 1: " << std::endl;
  t2.join ();
  t3.join ();

  deallog << "OK" << std::endl;
}