Hi Brad, Soumen --

I don't know how much time might elapse between when the Chapel runtime
requests the creation of a system pthread (for Chapel task hosting) and
when that pthread becomes useful, but if it's long enough and the bodies
of the forall loops are small enough, it could be that the forall loops
are done before any new pthreads arrive to help.  If so, then increasing
the iteration count from 2048 to a million or more might be enough to
ensure that actual parallelism occurs.
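
A minimal sketch of that experiment might look like the following.  The
problem size n, the work() helper and its body are assumptions of mine,
standing in for doSomethingOnAElements(), whose body is not shown in this
thread.

```chapel
// Hypothetical problem size; override at run time with --n=<value>.
config const n = 1000000;

const D = {1..n};
var A: [D] int;

// Stand-in for doSomethingOnAElements(i): a small amount of per-element
// work, so each task has something measurable to do.
proc work(i: int): int {
  var acc = 0;
  for k in 1..100 do
    acc += (i + k) % 7;
  return acc;
}

// Data-parallel loop over the whole domain.
forall i in D do
  A[i] = work(i);

// Print a checksum rather than the whole array, so output time does not
// dominate what is being observed.
writeln("checksum = ", + reduce A);
```

Running the same binary with --n=2048 and then with the default should
show whether the loop simply finishes too quickly at the smaller size for
additional threads to contribute.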

Assuming the default fifo tasking layer is being used, any system
pthreads created to host Chapel tasks will continue running, and looking
for work, after those tasks have completed.  So if we create pthreads
for the tasks in the forall, those pthreads will continue to exist until
the program terminates.  This would explain why the cores are busy after
the writeln().

Soumen, if you do

  $CHPL_HOME/util/printchplenv

what is the CHPL_TASKS setting it reports?

greg


On Wed, 29 Jan 2014, Brad Chamberlain wrote:

Hi Soumen --

Unless I'm missing something (and I don't think I am), by default, the two
forall loops ought to use four tasks/threads without any effort on your part
(i.e., no special flags or switches or anything).  The for loop and the
writeln() would only use 1 task/thread.  Depending on what your functions
are, it may be that these are short-enough running threads that the OS
doesn't move them around to use all four cores, but that would surprise me
slightly (in particular, I'd think that the threads would spread out pretty
quickly, if not be created in a spread-out manner).

That leads me to ask: What technique are you using to determine whether or
not four cores are being used?

I'll note that we have some current work going on to better map specific
tasks to specific NUMA domains within a node, a prototype of which was
released in the 1.8.0 release, but I don't think that should be necessary to
end up using all of your cores.

Thanks,
-Brad


_____________________________________________________________________________________________________
From: Soumen [[email protected]]
Sent: Wednesday, January 29, 2014 6:15 AM
To: [email protected]
Subject: forall loop not using all cores.

Hi,

I have Chapel 1.8.0 installed with default settings on my desktop, which has
this configuration:
OS: Ubuntu 12.04.04
RAM: 4 GB
Processor: i5 2320 @ 3.00 GHz x 4

The code I am trying to run is:

var d : domain(1) = {1..2048};
var A : [d] int;

forall i in d {
  doSomethingOnAElements(i);
}

for i in 1..40 {
  doSomething();
}

forall i in d {
  doSomethingOnAElements(i);
}

writeln(A);


The problem is that, up until writeln(A), the code runs on only a single
core. But after displaying A, Chapel uses all four cores for a period longer
than it took to display A.

The problem is the same if coforall is used, or if for is used in place of
forall.

So how can I make Chapel use all four cores every time?

Below are the options (all default) used for Chapel:
 
Parallelism Control Options:
      --[no-]local                    Target one [many] locale[s]
          currently: --local

      --[no-]serial                   [Don't] Serialize parallel constructs
          currently: --no-serial

      --[no-]serial-forall            [Don't] Serialize forall constructs
          currently: --no-serial-forall

Optimization Control Options:
      --fast                          Use fast default settings
          currently: not selected

      --[no-]fast-followers           Enable [disable] fast followers
          currently: --fast-followers

      --[no-]optimize-loop-iterators  Enable [disable] optimization of
                                      iterators composed of a single loop
          currently: --optimize-loop-iterators

Waiting for reply.

Soumen


