Ray Spence <[email protected]> writes:

> All,
>
> I've read enough to try implementing all or some combination of the JSV and
> prolog suggestions. Per Dave's comment on core binding on recent AMD
> hardware - we are running Ubuntu 12.04 on 16-core Opteron 6272 cpus,
> introduced 11/2011.

So you need a version of SGE where the binding/topology support works
properly, and you need to do binding with it:

  $ qstat -help|head -n 1
  SGE 8.1.2
  $  ssh node246 grep -m1 model.name /proc/cpuinfo
  model name    : AMD Opteron(TM) Processor 6276
  $ qhost -F -h node246 | grep hl:m_
  hl:m_topology=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC
  hl:m_socket=4.000000
  hl:m_core=64.000000
  hl:m_thread=64.000000
  hl:m_topology_inuse=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC
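
As a quick sanity check on output like the above, here is a minimal sketch
that counts the units in a topology string. It assumes the usual SGE
encoding (S = socket, C = core, T = hardware thread, with lower case in
m_topology_inuse marking units already bound, as far as I know). The
function name summarize_topology is made up for illustration, and the rule
that a string with no T reports one thread per core is an assumption based
on the m_thread value shown above.

```python
def summarize_topology(topology: str) -> dict:
    """Count sockets, cores and hardware threads in an SGE topology string."""
    # Fold case so that in-use (lower-case) units are counted too.
    upper = topology.upper()
    sockets = upper.count("S")
    cores = upper.count("C")
    threads = upper.count("T")
    # Assumption: with no explicit T entries, each core counts as one thread.
    if threads == 0:
        threads = cores
    return {"sockets": sockets, "cores": cores, "threads": threads}


if __name__ == "__main__":
    # Four sockets of sixteen cores each, as reported for node246 above.
    node246 = ("S" + "C" * 16) * 4
    print(summarize_topology(node246))
```

If the counts don't match hl:m_socket, hl:m_core and hl:m_thread, the
topology detection on that host is probably not working properly.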

If you're using the Ubuntu gridengine package, my advice would be don't,
for various reasons.

[You should also make sure your MPI library works properly (if you use
MPI); for Open MPI that means at least version 1.6.1.  I'd better not
start the extensive Interlagos lecture, which we unfortunately haven't
written up, but note that if you don't bind processes/threads, you can
lose a factor of two in performance straight off, with scope for more.]
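
One rough way to see whether a job's processes actually ended up bound is
to compare the CPUs a process is allowed to run on with the number of CPUs
on the node. The sketch below is Linux-only and uses os.sched_getaffinity
from the Python standard library; the helper name binding_report and the
"bound" heuristic (fewer allowed CPUs than the total) are assumptions for
illustration, not anything SGE or Open MPI provides.

```python
import os


def binding_report() -> dict:
    """Report which CPUs the current process may run on (Linux only)."""
    # CPUs in the affinity mask of the calling process (pid 0 means "self").
    allowed = sorted(os.sched_getaffinity(0))
    # Number of logical CPUs reported for the machine.
    total = os.cpu_count() or len(allowed)
    # Heuristic: treat the process as bound if it is restricted to a subset.
    return {"allowed": allowed, "total": total, "bound": len(allowed) < total}


if __name__ == "__main__":
    print(binding_report())
```

Run from inside a job script, an unbound job on a 64-core node would be
expected to list all 64 CPUs as allowed.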

> So, we'll see what happens with a JSV and/or a prolog script as Reuti wrote.
> I'm assuming these two aren't mutually exclusive?

I suppose that depends on what you're doing, but what Reuti suggested
won't work on your hardware with that version of SGE.  I've posted
fragments of my JSV and sge_request before, but the sge_request default
depends on the current version.  (I don't mean there's anything wrong with
what Reuti says, of course.)  I'm not sure how the prolog is useful.

> The JSV logic dictates one core if no PE is submitted in qsub while the
> prolog only controls omp_num_threads.

> We also have cgroups available. Any suggestions for controlling cpu
> affinity via cgroups inside SGE?

Depending on what "cpu affinity" means: what do you want, beyond the
following, that doesn't have to be done in the OpenMP program itself?

>> It makes a difference what OS and hardware is involved, but for
>> GNU/Linux -- short of tricks with setuid starter methods -- as far as I
>> know, the only currently-available way of confining jobs is with the
>> cpuset support in sge-8.1.2, described in
>> <http://arc.liv.ac.uk/SGE/howto/remove_orphaned_processes.html>.

-- 
Community Grid Engine:  http://arc.liv.ac.uk/SGE/
