Amit,

DUCC should use all available resources as configured by your ducc.classes and ducc.nodes files.
Lou.

On Thu, Oct 16, 2014 at 11:59 AM, Amit Gupta <[email protected]> wrote:

> Thanks for the clarification, Burn.
>
> So indeed there is no way to "force" a job to scale out to the maximum
> resources available?
>
> What I'm finding is that even though a job takes > 1 hour to complete using
> 2 nodes, it doesn't use some extra available nodes which are part of the
> ducc cluster.
>
> a. Is there no configuration option to deal with this (I'm guessing this
> requirement may have come up before)?
>
> b. Would you happen to know what part of the UIMA code makes that decision
> (i.e. the trigger to spawn a process on a new node or not)?
>
> Thanks again for your help,
>
> Best,
> Amit
>
> On Thu, Oct 16, 2014 at 9:32 AM, Burn Lewis <[email protected]> wrote:
>
> > Yes, that parameter only limits the maximum scaleout. DUCC will ramp up
> > the number of processes based on the available resources and the amount
> > of work to be done. It initially starts only 1 or 2, and only when one
> > initializes successfully will it start more. It may not start more if it
> > suspects that all the work will be completed on the existing nodes before
> > any new ones are ready.
> >
> > There is an additional type of scaleout, within each process, controlled
> > by --process_thread_count, which controls how many threads in each
> > process are capable of processing separate work items.
> >
> > ~Burn
> >
> > On Wed, Oct 15, 2014 at 7:11 PM, Amit Gupta <[email protected]> wrote:
> >
> > > Hi,
> > > I've been trying to find the options related to configuration of
> > > scaleout of a ducc job.
> > >
> > > Thus far the only one I've found is:
> > >
> > > process_deployments_max:
> > > which limits the maximum number of processes spawned by a ducc job.
> > >
> > > At what point does DUCC decide to spawn a new process or spread
> > > processing out to a new node? Is there a tuning parameter for an
> > > optimal number of work items per process spawned? Can the user control
> > > this behavior?
> > >
> > > For example,
> > > I have a job large enough that DUCC natively spreads it across 2 nodes.
> > > I haven't been able to force this job, via a config parameter, to
> > > spread across 4 nodes (or "X" nodes) for faster processing times.
> > >
> > > Does anyone know if there's a parameter that can directly control
> > > scaleout in this manner?
> > >
> > > Thanks,
> > >
> > > --
> > > Amit Gupta
>
> --
> Amit Gupta
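To make the thread's takeaways concrete, here is a minimal sketch of a DUCC job specification (a Java-properties file) setting the two scaleout controls discussed above. Only process_deployments_max and process_thread_count come from the discussion; the description and the specific values are placeholders for illustration.

```properties
# Illustrative job specification fragment; values are placeholders.
description             = Example scaleout tuning

# Upper bound on the number of job processes DUCC may deploy.
# Per Burn's reply, this is a ceiling, not a floor: DUCC ramps
# processes up incrementally based on remaining work and resources.
process_deployments_max = 4

# Threads per process, each capable of handling a separate work item
# (the per-process scaleout described in the thread).
process_thread_count    = 8
```

Note that, as Burn explains, DUCC may still stop short of the maximum if it expects the existing processes to finish the remaining work before any new ones would complete initialization.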
