If your job has unprocessed work, then perhaps the unused nodes are not in
the scheduling class you specified, or are too small.  Note that the
example below has all of its work either completed or active, so it has no
work waiting to be processed.

State: Running  Workitems: 16  Done: 12  Error: 0  Dispatch: 4  Unassigned: 0  Limbo: 0
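
For reference, the scheduling class is chosen at submission time; a minimal
sketch (assuming a class named "normal" is defined in your ducc.classes):

  $DUCC_HOME/bin/ducc_submit --scheduling_class normal ...
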
~Burn

On Thu, Oct 16, 2014 at 2:38 PM, Lou DeGenaro <[email protected]>
wrote:

> Amit,
>
> DUCC should use all available resources as configured by your ducc.classes
> and ducc.nodes files.
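>
> For reference, ducc.nodes is simply a list of host names, one per line; a
> minimal sketch (host names invented):
>
>   node01.example.com
>   node02.example.com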
>
> Lou.
>
>
> On Thu, Oct 16, 2014 at 11:59 AM, Amit Gupta <[email protected]>
> wrote:
>
> > Thanks for the clarification, Burn,
> >
> > So indeed there is no way to "force" a job to scale out to the maximum
> > available resources?
> >
> > What I'm finding is that even though a job takes > 1 hour to complete
> > using 2 nodes, it doesn't use some extra available nodes that are part
> > of the DUCC cluster.
> >
> > a. Is there no configuration option to deal with this? (I'm guessing
> > this requirement may have come up before.)
> >
> > b. Would you happen to know what part of the UIMA code makes that
> > decision (i.e., the trigger to spawn a process on a new node or not)?
> >
> >
> > Thanks again for your help,
> >
> > Best,
> > Amit
> >
> > On Thu, Oct 16, 2014 at 9:32 AM, Burn Lewis <[email protected]> wrote:
> >
> > > Yes, that parameter only limits the maximum scaleout.  DUCC will ramp
> > > up the number of processes based on the available resources and the
> > > amount of work to be done.  It initially starts only 1 or 2, and only
> > > when one initializes successfully will it start more.  It may not
> > > start more if it suspects that all the work will be completed on the
> > > existing nodes before any new ones are ready.
> > >
> > > There is an additional type of scaleout, within each process,
> > > controlled by --process_thread_count, which specifies how many threads
> > > in each process are capable of processing separate work items.
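> > >
> > > For example, a hypothetical ducc_submit invocation combining the two
> > > knobs (the descriptor names and values are placeholders, not from this
> > > thread):
> > >
> > >   $DUCC_HOME/bin/ducc_submit \
> > >     --driver_descriptor_CR MyCollectionReader.xml \
> > >     --process_descriptor_AE MyAnalysisEngine.xml \
> > >     --process_deployments_max 4 \
> > >     --process_thread_count 8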
> > >
> > > ~Burn
> > >
> > > On Wed, Oct 15, 2014 at 7:11 PM, Amit Gupta <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > > I've been trying to find the options related to configuring the
> > > > scaleout of a DUCC job.
> > > >
> > > > Thus far the only one I've found is:
> > > >
> > > > process_deployments_max:
> > > > which limits the maximum number of processes spawned by a DUCC job.
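> > > >
> > > > e.g., set on the ducc_submit command line or in a job specification
> > > > properties file (a sketch; the value is made up):
> > > >
> > > >   process_deployments_max = 4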
> > > >
> > > > At what point does DUCC decide to spawn a new process or spread
> > > > processing out to a new node?  Is there a tuning parameter for an
> > > > optimal number of work items per process spawned?  Can the user
> > > > control this behavior?
> > > >
> > > > For example, I have a job large enough that DUCC natively spreads it
> > > > across 2 nodes.  I haven't been able to force this job, via a config
> > > > parameter, to spread across 4 nodes (or "X" nodes) for faster
> > > > processing times.
> > > >
> > > > Does anyone know if there's a parameter that can directly control
> > > > scaleout in this manner?
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Amit Gupta
> > > >
> > >
> >
> >
> >
> > --
> > Amit Gupta
> >
>
