The split is skewed. Running a single sqoop action causes some containers
to finish early and others to finish late. If we run the actions
sequentially, the slots freed by the early finishers sit idle until all
containers for that action are done and the next action can start. By
running the actions in parallel, we finish earlier overall and also utilize
our cluster resources better.
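
To illustrate the point with made-up numbers, here is a minimal sketch in
Python. It is not how YARN or Oozie actually schedules anything; it just
assumes a fixed pool of slots (standing in for the connection limit) and
hands each mapper to whichever slot frees up first. The mapper durations,
the slot count and the function names are all hypothetical, and launcher
containers are ignored.

```python
import heapq

def makespan(durations, slots):
    """Greedy list scheduling: each task goes to the slot that frees up first."""
    free_at = [0] * slots          # time at which each slot becomes free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest available slot
        heapq.heappush(free_at, start + d)  # slot is busy until start + d
    return max(free_at)            # the last slot to finish ends the run

def sequential_total(actions, slots):
    """One action at a time: the next action waits for the slowest mapper."""
    return sum(makespan(a, slots) for a in actions)

def parallel_total(actions, slots):
    """All actions forked: every mapper competes for the same slot pool."""
    return makespan([d for a in actions for d in a], slots)

# Hypothetical skewed splits: one slow mapper per action, the rest are fast.
actions = [[10, 2, 2, 2], [8, 1, 1, 1], [6, 3, 1, 1]]
slots = 4

print("sequential:", sequential_total(actions, slots))
print("parallel:  ", parallel_total(actions, slots))
```

With these numbers the sequential run is bounded by the slowest mapper of
each action in turn, while the forked run lets the short mappers of the
other actions fill the slots that would otherwise sit idle.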

regards
/Pelle

On Thu, May 26, 2016 at 3:09 AM, Robert Kanter <[email protected]> wrote:

> Hi,
>
> If you want to only run one of the Sqoop Actions at a time, why not simply
> remove the fork and run the Sqoop Actions sequentially?
>
> - Robert
>
> On Tue, May 3, 2016 at 12:15 AM, Per Ullberg <[email protected]> wrote:
>
> > Hi,
> >
> > We have an oozie workflow that imports data table by table from a RDBMS
> > using sqoop. One action per table. The sqoop commands use "split by
> > column" and spread out on a number of mappers.
> >
> > We fork all the actions so basically all sqoop jobs are launched at once.
> >
> > The RDBMS can only accept a fixed number of connections and if this is
> > exceeded, the sqoop action will fail and eventually the whole oozie
> > workflow will fail.
> >
> > We use the yarn capacity scheduler (2.6.0) and have set up a specific
> > queue for this job to throttle the maximum number of concurrent
> > containers. However, this setup is hard to manage because all
> > configurations in the capacity scheduler are relative to the max amount
> > of vcores of the cluster and as we add machines or otherwise tune the
> > cluster, the actual number of containers granted to the oozie job
> > changes and at times we hit the connection roof.
> >
> > So, is there another way to throttle the number of concurrent containers
> > for an oozie job? I guess you would have to be able to throttle both
> > launchers and map-reduce containers?
> >
> > best regards
> > /Pelle
> >
> >
> > --
> >
> > *Per Ullberg*
> > Tech Lead
> > Odin - Uppsala
> >
> > Klarna AB
> > Sveavägen 46, 111 34 Stockholm
> > Tel: +46 8 120 120 00
> > Reg no: 556737-0431
> > klarna.com
> >
>



