Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler
Hi,

The purpose of our schedulers is to efficiently execute many activities with few threads. There is no need to programmatically pack logically separate tasks into a few asyncs unless the tasks are very small.

There are runtime overheads involved in mapping many asyncs to few threads. The smaller the tasks, the higher the relative overhead.

The default scheduler and the alternate scheduler enabled with the -WORK_STEALING compiler option have different trade-offs.

With the default scheduler, the cost of an async is relatively high. In particular, it involves the creation of a heap-allocated object to represent the async.

With the alternate scheduler, the whole application code is transformed. The resulting code is slower, but the cost of an async in the transformed code is much lower. The alternate scheduler is therefore able to efficiently handle much smaller asyncs than the default, but if the tasks are large and few, it ends up being slower than the default.

In short, it makes sense to pack multiple small tasks together (1 async) to minimize scheduling overheads, but this is much less of an issue with the -WORK_STEALING scheduler. On the other hand, too much packing a priori limits the scheduler's ability to load balance the execution (keep all threads busy at all times).

Olivier
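The packing trade-off described in the reply above can be made concrete with a small sketch. This is only an illustration, not code from the thread: `doTask`, `nTasks`, and `chunk` are hypothetical names, and the syntax assumes X10 2.4 or later (`Rail[String]` for `main`, `Long` integer literals); the 2.2 release discussed in the thread writes these differently. Because the thread reports that the work-stealing transformation mishandles complicated loops and shortcut assignment operators, the sketch sticks to plain range loops and `val` bindings.

```x10
public class PackedAsyncs {
    // Hypothetical stand-in for one independent unit of work.
    static def doTask(t:Long):void {
        // ... real work would go here ...
    }

    public static def main(args:Rail[String]):void {
        val nTasks:Long = 10000;  // total number of independent tasks (assumed)
        val chunk:Long = 100;     // tasks packed into one async (tuning knob, assumed)
        val nChunks = (nTasks + chunk - 1) / chunk;  // round up

        // One async per chunk; finish waits for all of them.
        finish for (c in 0..(nChunks - 1)) async {
            val first = c * chunk;
            val last = Math.min(first + chunk, nTasks) - 1;
            for (t in first..last) doTask(t);
        }
    }
}
```

With `chunk` set to 1 every task is its own async, which leaves the most room for load balancing but pays the per-async cost most often. A larger `chunk` reduces that cost and narrows what the scheduler can rebalance. Which value works best depends on task size and on which scheduler the program is compiled for, so it would have to be measured.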
Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler
I am trying to use the work-stealing scheduler, but could not find detailed documentation.

If I have 16 tasks, which I want to parallelize on 4 threads, static scheduling would be:

for i = 1 to 4
    async(i, i*4, (i+1)*4) // do tasks i*4 to (i+1)*4-1

This assigns an equal number of tasks to each thread.

Now, if I have to use the work-stealing scheduler, is no change required? Does only the flag have to be different? If the program itself needs to implement the logic of work-stealing, then the work-stealing scheduler itself will have nothing to do.

Please let me know how I should implement this. I will be thankful for your help.

Thanks and Regards
Sparsh Mittal

On Wed, Apr 17, 2013 at 2:56 PM, Olivier Tardieu tard...@us.ibm.com wrote:

Hi,

This option has not been properly maintained lately, so it may have problems when running with trunk. It was developed for X10 2.2 and was mostly tested with r22646. It has some limitations that we described in our PPoPP'12 paper "A Work-Stealing Scheduler for X10's Task Parallelism with Suspension". It also has a few bugs: it does not properly handle complicated loops and shortcut assignment operators.

No change is required to the program. In fact, it is designed to deliver better performance than the default scheduler in the presence of many small tasks. With sequential programs or programs with few large tasks, however, it performs worse.

As usual, X10_NTHREADS needs to be set to the number of cores available to each place (no change).

Olivier

Sparsh Mittal sparsh0mit...@gmail.com wrote on 04/15/2013 10:26:31 AM:

From: Sparsh Mittal sparsh0mit...@gmail.com
To: x10-users@lists.sourceforge.net
Date: 04/15/2013 10:35 AM
Subject: [X10-users] Inquiring about how to properly x10's work-stealing scheduler

Hello

I saw: "There is an experimental Cilk-style workstealing scheduler available in recent X10 releases (use x10c++ -WORK_STEALING=true to compile your program), however it is still in development and may have implementation limitations that prevent it from working on some input programs."

I wanted to use this scheduler. I have a program which does, say, 10,000 tasks. I have parallelized it using, say, 8 threads. Each task is independent and there is no communication required for solving any task. There is no dependency between tasks.

Once my program is written, I can use

x10c++ -WORK_STEALING=true Program.x10

I wanted to ask: does my program have to spawn 4 threads and distribute the tasks individually to them and use the -WORK_STEALING=true option, or is there any other way to use this work-stealing scheduler? In other words, how do I use this scheduler?

I will be thankful for your reply.

Thanks and Regards
Sparsh Mittal

___
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users
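For reference, the static-scheduling pseudocode in the question could be written in X10 roughly as follows. This is a hedged sketch rather than the poster's actual program: `doTask` is a hypothetical placeholder, the syntax assumes X10 2.4 or later, and the outer index runs from 0 to 3 instead of 1 to 4 so that the 16 tasks are numbered 0 through 15 (with `i` from 1 to 4 the stated bounds would cover tasks 4 through 19).

```x10
public class StaticSplit {
    // Hypothetical stand-in for one independent unit of work.
    static def doTask(t:Long):void {
        // ... real work would go here ...
    }

    public static def main(args:Rail[String]):void {
        // Four asyncs, each running a fixed block of four tasks.
        finish for (i in 0..3) async {
            for (t in (i * 4)..((i + 1) * 4 - 1)) doTask(t);
        }
    }
}
```

Per the replies in this thread, the same source is compiled for either scheduler and only the compiler flag differs. With only four coarse asyncs, though, the scheduler has little left to rebalance if some blocks take longer than others.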
Re: [X10-users] Inquiring about how to properly x10's work-stealing scheduler
Thanks a lot for your reply.

Thanks and Regards
Sparsh Mittal