Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler

2013-04-30 Thread Olivier Tardieu
Hi,

The purpose of our schedulers is to efficiently execute many activities 
with few threads.
There is no need to programmatically pack logically separate tasks into 
a few asyncs unless the tasks are very small.
There are runtime overheads involved in mapping many asyncs to few 
threads.
The smaller the tasks, the higher the overhead.

The default scheduler and the alternate scheduler triggered with the 
-WORK_STEALING compiler option have different trade-offs.
With the default scheduler, the cost of an async is relatively high. In 
particular, it involves the creation of a heap-allocated object to 
represent the async.
With the alternate scheduler, the whole application code is transformed. 
The resulting code is slower, but the cost of an async in the transformed 
code is much lower.
The alternate scheduler is therefore able to handle efficiently much 
smaller asyncs than the default, but if the tasks are large and few, it 
ends up being slower than the default.

In short, it makes sense to pack multiple tasks together (1 async) to 
minimize scheduling overheads.
But this is much less of an issue with the -WORK_STEALING scheduler.
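
To make the packing idea concrete, here is a minimal sketch in Java rather than X10, using java.util.concurrent.ForkJoinPool as a stand-in for a pool of worker threads. It is only an analogy: the class PackingSketch, the task() body, and the chunk parameter are hypothetical, and nothing here reflects how the X10 runtime is implemented. With chunk = 1 every task is submitted on its own (one job per task); with chunk = 4 four consecutive tasks share one job, so fewer job objects are created and scheduled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

public class PackingSketch {
    // Hypothetical stand-in for one small, independent task.
    static long task(int id) {
        return (long) id * id;
    }

    // Runs tasks 0..numTasks-1 on `threads` workers, packing `chunk`
    // consecutive tasks into each submitted job, and returns the summed results.
    static long run(int numTasks, int chunk, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            List<ForkJoinTask<Long>> jobs = new ArrayList<>();
            for (int start = 0; start < numTasks; start += chunk) {
                final int lo = start;
                final int hi = Math.min(start + chunk, numTasks);
                // One job object is allocated per chunk, not per task.
                jobs.add(pool.submit(() -> {
                    long sum = 0;
                    for (int t = lo; t < hi; t++) {
                        sum += task(t);
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (ForkJoinTask<Long> job : jobs) {
                total += job.join(); // wait for each job to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(16, 1, 4)); // 16 jobs of 1 task each
        System.out.println(run(16, 4, 4)); // 4 jobs of 4 tasks each
    }
}
```

Both calls compute the same result; only the number of scheduled jobs differs, which is the overhead being traded off.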

On the other hand, too much packing a priori limits the scheduler's 
ability to load balance the execution (keep all threads busy at all times).
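
The load-balancing point can be illustrated with a small back-of-the-envelope simulation, again in Java and again only a sketch. The task costs are made-up numbers, scheduling overhead is ignored entirely, and dynamicMakespan models an idealized scheduler that hands each task to whichever thread becomes free first; it is not a model of the actual X10 schedulers. LoadBalanceSketch and both method names are hypothetical.

```java
import java.util.Arrays;

public class LoadBalanceSketch {
    // Static packing: thread k runs tasks k*chunk .. (k+1)*chunk-1 and nothing else.
    // Assumes cost.length is a multiple of threads.
    static int staticMakespan(int[] cost, int threads) {
        int chunk = cost.length / threads;
        int worst = 0;
        for (int k = 0; k < threads; k++) {
            int busy = 0;
            for (int t = k * chunk; t < (k + 1) * chunk; t++) {
                busy += cost[t];
            }
            worst = Math.max(worst, busy);
        }
        return worst;
    }

    // Idealized dynamic scheduling: each task, taken in order, goes to the
    // thread that becomes free first. Overheads are not modeled.
    static int dynamicMakespan(int[] cost, int threads) {
        int[] freeAt = new int[threads];
        for (int c : cost) {
            int earliest = 0;
            for (int k = 1; k < threads; k++) {
                if (freeAt[k] < freeAt[earliest]) {
                    earliest = k;
                }
            }
            freeAt[earliest] += c;
        }
        return Arrays.stream(freeAt).max().getAsInt();
    }

    public static void main(String[] args) {
        // Hypothetical costs: four expensive tasks followed by twelve cheap ones.
        int[] cost = {8, 8, 8, 8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
        System.out.println(staticMakespan(cost, 4));  // prints 32
        System.out.println(dynamicMakespan(cost, 4)); // prints 11
    }
}
```

With these costs, packing four tasks per thread up front leaves three threads idle while the first one works through all the expensive tasks, whereas keeping the tasks separate lets the work spread evenly.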

Olivier


Sparsh Mittal sparsh0mit...@gmail.com wrote on 04/29/2013 01:59:26 PM:

 From: Sparsh Mittal sparsh0mit...@gmail.com
 To: Mailing list for users of the X10 programming language x10-us...@lists.sourceforge.net
 Date: 04/29/2013 02:06 PM
 Subject: Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler
 
 I am trying to use the work-stealing scheduler, but could not find 
 detailed documentation.
 
 If I have 16 tasks, which I want to parallelize on 4 threads, static
 scheduling would be:
 
 for i = 0 to 3
  async(i, i*4, (i+1)*4) // do tasks i*4 to (i+1)*4-1
 
 This assigns an equal number of tasks to each thread. Now, if I want to
 use the work-stealing scheduler, is no change required? Does only the
 flag have to be different? If the program itself needs to implement the
 work-stealing logic, then the work-stealing scheduler itself will have
 nothing to do. Please let me know how I should implement this. I will be
 thankful for your help.
  

 Thanks and Regards
 Sparsh Mittal

 

 On Wed, Apr 17, 2013 at 2:56 PM, Olivier Tardieu tard...@us.ibm.com wrote:
 Hi,
 
 This option has not been properly maintained lately.
 So it may have problems when running with trunk.
 It was developed for X10 2.2 and was mostly tested with r22646.
 It has some limitations that we described in our PPoPP'12 paper "A
 Work-Stealing Scheduler for X10's Task Parallelism with Suspension".
 It also has a few bugs: it does not properly handle complicated loops or
 shortcut assignment operators.
 
 No change is required to the program.
 In fact, it is designed to deliver better performance than the default
 scheduler in the presence of many small tasks.
 With sequential programs or programs with few large tasks, it performs
 worse however.
 As usual, X10_NTHREADS needs to be set to the number of cores available to
 each place (no change).
 
 Olivier
 
 
 Sparsh Mittal sparsh0mit...@gmail.com wrote on 04/15/2013 10:26:31 AM:
 
  From: Sparsh Mittal sparsh0mit...@gmail.com
  To: x10-users@lists.sourceforge.net
  Date: 04/15/2013 10:35 AM
  Subject: [X10-users] Inquiring about how to properly x10's work-stealing scheduler
 
  Hello
 
  I saw:
 
  There is an experimental Cilk-style workstealing scheduler available
  in recent X10 releases (use x10c++ -WORK_STEALING=true to compile
  your program), however it is still in development and may have
  implementation limitations that prevent it from working on some
  input programs.
 
  I wanted to use this scheduler. I have a program which does say 10,000
  tasks. I have parallelized it using say 8 threads.
 
   Each task is independent and there is no communication required for
  solving any task. There is no dependency b/w tasks. Once my program
  is written, I can use
  x10c++ -WORK_STEALING=true Program.x10
 
  I wanted to ask, does my program have to spawn 4 threads and
  distribute the tasks individually to them and use the
  -WORK_STEALING=true option, or is there any other way to use this
  work-stealing scheduler?  In other words, how do I use this scheduler?
 
  I will be thankful for your reply.
 
  Thanks and Regards
  Sparsh Mittal
 
 

Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler

2013-04-29 Thread Sparsh Mittal
I am trying to use the work-stealing scheduler, but could not find detailed
documentation.

If I have 16 tasks, which I want to parallelize on 4 threads, static
scheduling would be:

for i = 0 to 3
 async(i, i*4, (i+1)*4) // do tasks i*4 to (i+1)*4-1

This assigns an equal number of tasks to each thread. Now, if I want to use
the work-stealing scheduler, is no change required? Does only the flag have
to be different? If the program itself needs to implement the work-stealing
logic, then the work-stealing scheduler itself will have nothing to do.
Please let me know how I should implement this. I will be thankful for your help.


Thanks and Regards
Sparsh Mittal



X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users


Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler

2013-04-22 Thread Sparsh Mittal
Thanks a lot for your reply.

Thanks and Regards
Sparsh Mittal


