Here are some rough thoughts:

* numCores() has been criticized as saying too much about a specific
  processor architecture -- i.e., what does it mean on a GPU, on an
  XMT, etc.?  I agree with this, but also like the idea of being able
  to make nonportable queries like this for locale types that logically
  support them because I think it's a nice clear way to think about the
  HW on which you're running.  I think the main downside is maintenance
  cost, but maybe I'm forgetting some other argument against doing so...

* To address the architecture-specificity of it, I think we should have
  a more general query that tells the "natural" degree of parallelism
  supported by the hardware.  A verbose name for it might be something
  like degreeOfHardwareParallelism() but I'm not actually suggesting we
  use that.  The idea is that this would be portable across all locale
  types and report something reasonable.

* Assuming numCores() persists as proposed in bullet 1, I think it should
  report the number of physical cores, not counting hyperthreading.
  I'd then add a separate call to query the degree of virtual
  oversubscription (for example, hardware threads per core), so that
  the user can combine the two values as they wish.

* In such a world, I'd imagine that the default value of
  dataParTasksPerLocale (and other such values) would either inherit the
  value from bullet 2 or be determined heuristically from other settings
  as well.  For instance, for Qthreads programs, we've seen better
  performance when dataParTasksPerLocale == the number of physical
  cores than when it equals the number of virtual processors (i.e.,
  counting hyperthreads).  So perhaps the qthreads tasking layer
  would set dataParTasksPerLocale to numCores() while fifo sets it to
  2*numCores() (assuming it sees benefit from that).
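
To make the bullets above concrete, here is a minimal sketch of how the
three proposed queries and the per-tasking-layer default might fit
together.  It is written in Python rather than Chapel, purely as an
illustration.  Every name in it (LocaleInfo, threads_per_core,
max_hw_parallelism, default_data_par_tasks) is hypothetical and not an
existing or proposed Chapel API, and treating the "natural" degree of
parallelism on a CPU as cores times hardware threads per core is an
assumption, not something the thread has settled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocaleInfo:
    """Hypothetical stand-in for the hardware queries discussed above."""
    physical_cores: int    # what numCores() would report under bullet 3
    threads_per_core: int  # degree of virtual oversubscription (1 = no hyperthreading)

    def num_cores(self) -> int:
        # Physical cores only, not counting hyperthreading.
        return self.physical_cores

    def max_hw_parallelism(self) -> int:
        # Assumed CPU definition of the portable query from bullet 2:
        # physical cores times hardware threads per core.
        return self.physical_cores * self.threads_per_core


def default_data_par_tasks(loc: LocaleInfo, tasking_layer: str) -> int:
    """Pick a default dataParTasksPerLocale based on the tasking layer."""
    if tasking_layer == "qthreads":
        # Qthreads reportedly does better with one task per physical core.
        return loc.num_cores()
    if tasking_layer == "fifo":
        # Assumes fifo benefits from using every hardware thread.
        return loc.max_hw_parallelism()
    raise ValueError(f"unknown tasking layer: {tasking_layer!r}")
```

With 4 physical cores and 2 hardware threads per core, this picks 4 tasks
for qthreads and 8 for fifo, which matches the numCores() versus
2*numCores() split suggested in the last bullet.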

OK, tear this apart -- I'm sure I've got plenty of exposed mistakes,
-Brad




On Fri, 9 May 2014, Ben Harshbarger wrote:

Hi all,

I ran into an issue with hyper-threading on OS X this morning, and after a
discussion with Brad, some questions came up that seem better suited to
a group discussion.

Currently numCores (at least on Linux) includes the count for
hyper-threading. This introduces some ambiguity in the meaning of
‘here.numCores’ (physical vs. logical), and raises some questions:
 - What should numCores represent?
 - Should we separate the notion of physical cores from the amount of
   parallelism in hardware (new variable/name)?
 - What does this mean for dataParTasksPerLocale?

Hopefully this gets the ball rolling.

-Ben

------------------------------------------------------------------------------
Is your legacy SCM system holding you back? Join Perforce May 7 to find out:
• 3 signs your SCM is hindering your productivity
• Requirements for releasing software faster
• Expert tips and advice for migrating your SCM now
http://p.sf.net/sfu/perforce
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers
