runtime questions

Greg Kreider Wed, 27 May 2015 07:01:53 -0700

Good morning.

A few questions about the runtime:


1. Is there any way to predict how a program's tasks will be
    executed on a given piece of hardware?  We've seen the
    performance of programs change depending on variations in
    the implementation and don't understand why.  A few
    examples: one program has an outer loop that calls a couple
    of subroutines, some using forall loops and others plain
    for loops.  If we use a for in the outer loop, then one core
    of the CPU is loaded at 100% and the other 3 at 40%.  If we
    switch to a forall or coforall, everything runs serially on
    one core.  Another program using cobegin never ran on more
    than two cores.  This is with qthreads + hwloc.  With the
    quickstart compiler, programs would saturate all the cores.

    You can get information about when tasks launch; is there
    some way to understand why the runtime makes the choices
    that lead to this behavior?  It would be better to make
    intelligent design decisions than to randomly swap for,
    forall, and coforall to see the effect.  Right now it's all
    a bit of a black box.

2. Does a 'for param' unroll within a cobegin?  One program
    only ran serially with this setup, while a coforall over the
    parameter ran in parallel.  But maybe this is related to the
    first question.
    The code skeleton looked like
       cobegin {
         for param bank in 1..nbank do run_filter(bank);
         /* process result */
       }
    vs.
       coforall bank in 1..nbank {
         run_filter(bank);
         /* process result */
       }
    where the number of banks was small, about 20, but too many
    to write out.

3. Some errors are trapped by the runtime; others are not, and
    the program just exits with a short message on the console.
    Examples of the latter include segfaults and floating point
    exceptions.  Is it possible to print the line number where
    the error occurred (as staring at the code waiting for
    enlightenment isn't the fastest way to find the problem)?
    If not, how do you recommend debugging problems like this?

4. We have one program that runs for a small number of iterations
    but dies with a slightly larger number (300 trials instead of
    100).  The load on the CPU seems normal, but the program just
    stops with a terse message on the console: "Killed".  How do
    we find out what's causing the problem?

Thanks again for the help,

Greg

------------------------------------------------------------------------------
_______________________________________________
Chapel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-users
