Well, good to know :)

No difference between the two builds (I made clean builds in two separate
copies of Chapel 1.15, with the only difference being CHPL_TASKS).
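
For anyone repeating this check, here is a minimal sketch of confirming that two configurations differ only in CHPL_TASKS by comparing saved `printchplenv` listings. The helper names (`parse_chplenv`, `diff_settings`) and the abbreviated fifo listing are illustrative assumptions, not part of Chapel. The comparison only covers what the listing reports, so it would not catch differences in, for example, runtime debug/optimize settings.

```python
def parse_chplenv(text):
    """Parse 'KEY: value' lines (as printed by printchplenv) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Drop the trailing '*' marker shown in the listing so that only
        # the value itself is compared.
        settings[key.strip()] = value.strip().rstrip("*").strip()
    return settings


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b))
            if a.get(k) != b.get(k)}


# Abbreviated, hypothetical listings; real ones would be saved from each build.
fifo_env = """CHPL_COMM: gasnet *
  CHPL_COMM_SUBSTRATE: ibv *
CHPL_TASKS: fifo
CHPL_MEM: jemalloc"""

qthreads_env = """CHPL_COMM: gasnet *
  CHPL_COMM_SUBSTRATE: ibv *
CHPL_TASKS: qthreads
CHPL_MEM: jemalloc"""

diff = diff_settings(parse_chplenv(fifo_env), parse_chplenv(qthreads_env))
print(diff)
```

An empty result would mean the two listings are identical; here only CHPL_TASKS is reported as differing.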

On Tue, Aug 15, 2017 at 6:35 PM, Greg Titus <[email protected]> wrote:

> Hello Hui --
>
> I think you already are risking correctness for performance.  :-)
>
> One of the side effects of throwing '--fast' at compile time is to disable
> guard pages by default at execution time.  Plus, while fifo and qthreads
> tasking don’t behave identically with respect to guard pages, they do both
> support them and just a difference in guard page setting wouldn’t cause a
> huge difference in performance between the two unless the app or benchmark
> did a lot of task creation.  So, I don’t think that’s it.
>
> A further question, then: were debug and optimize settings the same for
> both your runtime builds (tasks=fifo and =qthreads), and for the
> third-party package builds for qthreads and hwloc?
>
> greg
>
>
> > On Aug 15, 2017, at 3:27 PM, Hui Zhang <[email protected]> wrote:
> >
> > Thanks, Aji,
> >
> > I've verified that it was configured with "--enable-guard-pages" by
> > default when I built Chapel. But is that necessary for Chapel? Any thoughts
> > from Chapel? I don't want to take the risk of correctness for performance.
> > Thanks
> >
> > On Tue, Aug 15, 2017 at 4:50 PM, Aji, Ashwin <[email protected]> wrote:
> > These are my 2 cents from measuring qthreads performance in Chapel before.
> > If you configured qthreads with “--enable-guard-pages”, then the
> > performance will be much slower than without guard pages enabled. It may
> > be worthwhile to see how you have configured qthreads.
> >
> >
> >
> > Regards,
> >
> > Ashwin
> >
> >
> >
> > From: Hui Zhang [mailto:[email protected]]
> > Sent: Tuesday, August 15, 2017 11:50 AM
> > To: Greg Titus <[email protected]>
> > Cc: Chapel Sourceforge Developers List <chapel-developers@lists.sourceforge.net>
> > Subject: Re: [Chapel-developers] qthreads performance
> >
> >
> >
> > Hi, Greg
> >
> >
> >
> > On Tue, Aug 15, 2017 at 1:35 PM, Greg Titus <[email protected]> wrote:
> >
> > Hello Hui --
> >
> > Generally CHPL_TASKS=qthreads outperforms CHPL_TASKS=fifo at all but the
> > smallest scales.  We would need to know a lot more to come to any
> > worthwhile conclusions.  What is the output of `printchplenv --anonymize`
> > for your configurations (I assume they differ only in terms of the
> > CHPL_TASKS setting)?
> >
> > CHPL_TARGET_PLATFORM: linux64
> > CHPL_TARGET_COMPILER: gnu
> > CHPL_TARGET_ARCH: native *
> > CHPL_LOCALE_MODEL: flat
> > CHPL_COMM: gasnet *
> >   CHPL_COMM_SUBSTRATE: ibv *
> >   CHPL_GASNET_SEGMENT: large
> > CHPL_TASKS: qthreads
> > CHPL_LAUNCHER: gasnetrun_ibv *
> > CHPL_TIMERS: generic
> > CHPL_UNWIND: none
> > CHPL_MEM: jemalloc
> > CHPL_MAKE: gmake
> > CHPL_ATOMICS: intrinsics
> >   CHPL_NETWORK_ATOMICS: none
> > CHPL_GMP: gmp
> > CHPL_HWLOC: hwloc
> > CHPL_REGEXP: re2
> > CHPL_WIDE_POINTERS: struct
> > CHPL_AUX_FILESYS: none
> >
> > Yes, the only difference is CHPL_TASKS.
> >
> >
> >
> > Are you using any compilation options other than ‘--fast’?  What
> > execution options are you using?
> >
> > For hpl:  --n=500 --printArray=false --printStacts=true --useRandomSeed=false -nl *
> >
> > For lulesh:  --filename=lmeshes/sedov15oct.lmesh -nl *
> >
> > For isx:  --nide-weakISO --n=5592400 --numTrials=10 -nl *
> >
> > Are you setting any execution-time environment variables (CHPL_RT_*) and
> > if so, to what values?
> >
> > No
> >
> > And finally, what is the target architecture (number of nodes, number of
> > CPU cores per node, etc.)?
> >
> > I use 2/4/8/16/32 nodes, each with 20 physical cores
> >
> >
> >
> >
> > thanks,
> > greg
> >
> >
> >
> > > On Aug 15, 2017, at 9:59 AM, Hui Zhang <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > I did some performance comparisons between qthreads and fifo with 3
> > > benchmarks: lulesh, hpl, and isx. I expected qthreads to outperform fifo
> > > in all cases, but the results turned out to be surprising.
> > > For lulesh and hpl, in all tests (#nodes from 2 to 32), qthreads is
> > > much slower (it took 1.5~10x longer than fifo). For isx, qthreads
> > > beats fifo with a speedup of 1.5~2x.
> > >
> > > All benchmarks were compiled with --fast and I'm using 1.15. So is what
> > > I'm getting here reasonable? Are there any previous performance
> > > comparisons between fifo and qthreads on those benchmarks?
> > >
> > > Thanks
> > >
> > > --
> > > Best regards
> > >
> > >
> > > Hui Zhang
> >
> > > ------------------------------------------------------------------------------
> > > Check out the vibrant tech community on one of the world's most
> > > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > > _______________________________________________
> > > Chapel-developers mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/chapel-developers
> >
>
>


-- 
Best regards


Hui Zhang