Hello Hui --

I think you are already risking correctness for performance.  :-)

One of the side effects of throwing '--fast' at compile time is to disable 
guard pages by default at execution time.  And while fifo and qthreads 
tasking don’t behave identically with respect to guard pages, they both 
support them, and a difference in guard page settings alone wouldn’t cause a 
large performance gap between the two unless the app or benchmark did a lot 
of task creation.  So I don’t think that’s it.
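
If you want to rule guard pages out empirically anyway, one low-effort check 
is to pin the setting explicitly on the qthreads side and compare runs.  A 
sketch, assuming your qthreads build honors the usual qthreads environment 
variables (QT_GUARD_PAGES is the knob I have in mind; double-check the name 
against your qthreads version, and note that with a gasnet launcher you may 
need to make sure the variable gets forwarded to the compute nodes):

  # force guard pages off (what '--fast' should already imply)
  QT_GUARD_PAGES=false ./hpl <your usual options> -nl 2

  # force guard pages on
  QT_GUARD_PAGES=true ./hpl <your usual options> -nl 2

If those two runs perform about the same, guard pages aren't your problem.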

A further question, then: were debug and optimize settings the same for both 
your runtime builds (tasks=fifo and =qthreads), and for the third-party package 
builds for qthreads and hwloc?
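
If you're not sure, a quick way to take build-setting skew off the table is 
to rebuild both configurations from a clean tree with identical settings.  A 
sketch, assuming a source build under $CHPL_HOME (CHPL_DEVELOPER is the main 
switch I have in mind that flips the runtime to a debug build, so check 
whether you have it set):

  cd $CHPL_HOME
  unset CHPL_DEVELOPER       # if set, this gives a debug/low-opt runtime
  make cleanall              # clear existing runtime/third-party build products
  CHPL_TASKS=fifo     make
  CHPL_TASKS=qthreads make   # also rebuilds third-party qthreads and hwloc

Then recompile the benchmarks once per CHPL_TASKS setting and rerun.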

greg


> On Aug 15, 2017, at 3:27 PM, Hui Zhang <[email protected]> wrote:
> 
> Thanks, Aji,
> 
> I've verified that qthreads was configured with "--enable-guard-pages" by 
> default when I built Chapel. But is that necessary for Chapel? Any thoughts 
> from the Chapel side? I don't want to risk correctness for the sake of 
> performance.
> Thanks
> 
> On Tue, Aug 15, 2017 at 4:50 PM, Aji, Ashwin <[email protected]> wrote:
> Here are my 2 cents from measuring qthreads performance in Chapel before: if 
> you configured qthreads with “--enable-guard-pages”, then performance will be 
> much slower than without guard pages enabled. It may be worthwhile to check 
> how you have configured qthreads.
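> 
> One way to check, as a sketch (the exact path under third-party varies by 
> Chapel version, so treat the glob below as hypothetical): look at what 
> configure recorded in the qthreads build directory, since any autoconf-based 
> build writes its configure command line into config.log:
> 
>   # run from $CHPL_HOME; adjust the path for your tree
>   grep -i 'guard' third-party/qthreads/*/config.log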
> 
> Regards,
> 
> Ashwin
> 
> 
> From: Hui Zhang [mailto:[email protected]] 
> Sent: Tuesday, August 15, 2017 11:50 AM
> To: Greg Titus <[email protected]>
> Cc: Chapel Sourceforge Developers List 
> <[email protected]>
> Subject: Re: [Chapel-developers] qthreads performance
> 
> Hi Greg,
> 
> On Tue, Aug 15, 2017 at 1:35 PM, Greg Titus <[email protected]> wrote:
> 
> Hello Hui --
> 
> Generally CHPL_TASKS=qthreads outperforms CHPL_TASKS=fifo at all but the 
> smallest scales.  We would need to know a lot more to come to any worthwhile 
> conclusions.  What is the output of `printchplenv --anonymize` for your 
> configurations (I assume they differ  only in terms of the CHPL_TASKS 
> setting)? 
> 
> CHPL_TARGET_PLATFORM: linux64
> CHPL_TARGET_COMPILER: gnu
> CHPL_TARGET_ARCH: native *
> CHPL_LOCALE_MODEL: flat
> CHPL_COMM: gasnet *
>   CHPL_COMM_SUBSTRATE: ibv *
>   CHPL_GASNET_SEGMENT: large
> CHPL_TASKS: qthreads
> CHPL_LAUNCHER: gasnetrun_ibv *
> CHPL_TIMERS: generic
> CHPL_UNWIND: none
> CHPL_MEM: jemalloc
> CHPL_MAKE: gmake
> CHPL_ATOMICS: intrinsics
>   CHPL_NETWORK_ATOMICS: none
> CHPL_GMP: gmp
> CHPL_HWLOC: hwloc
> CHPL_REGEXP: re2
> CHPL_WIDE_POINTERS: struct
> CHPL_AUX_FILESYS: none
> 
> Yes, the only difference is CHPL_TASKS.
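> 
> To double-check that, one can capture both outputs and diff them (a sketch; 
> printchplenv lives in util/ in a source tree):
> 
>   CHPL_TASKS=fifo     $CHPL_HOME/util/printchplenv --anonymize > env-fifo.txt
>   CHPL_TASKS=qthreads $CHPL_HOME/util/printchplenv --anonymize > env-qthreads.txt
>   diff env-fifo.txt env-qthreads.txt   # should show only the CHPL_TASKS line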
> 
> Are you using any compilation options other than ‘--fast’?  What execution 
> options are you using? 
> 
> For hpl: --n=500 --printArray=false --printStats=true --useRandomSeed=false -nl *
> 
> For lulesh: --filename=lmeshes/sedov15oct.lmesh -nl *
> 
> For isx: --mode=weakISO --n=5592400 --numTrials=10 -nl *
> 
> Are you setting any execution-time environment variables (CHPL_RT_*) and if 
> so, to what values?
> 
> No.
> 
> And finally, what is the target architecture (number of nodes, number of CPU 
> cores per node, etc.)?
> 
> I use 2/4/8/16/32 nodes, each with 20 physical cores.
> 
> thanks,
> greg
> 
> 
> 
> > On Aug 15, 2017, at 9:59 AM, Hui Zhang <[email protected]> wrote:
> >
> > Hello,
> >
> > I did a performance comparison between qthreads and fifo with 3 
> > benchmarks: lulesh, hpl, and isx. I expected qthreads to outperform fifo in 
> > all cases, but the results turned out to be surprising.
> > For lulesh and hpl, in all tests (#nodes from 2 to 32), qthreads is much 
> > slower (taking 1.5~10x longer than fifo). For isx, qthreads beats 
> > fifo with a speedup of 1.5~2x.
> >
> > All benchmarks were compiled with --fast, and I'm using 1.15. Is what I'm 
> > getting here reasonable? Is there any previous performance comparison 
> > between fifo and qthreads on those benchmarks?
> >
> > Thanks
> >
> > --
> > Best regards
> >
> >
> > Hui Zhang
> 
> -- 
> Best regards
> Hui Zhang
