Hi All,
For those following along who have not tried Ted's idea (running multiple
Drillbits per host), note that when running two or more Drillbits per node, the
admin is responsible for choosing non-conflicting port numbers.
The port numbers are configured in drill-override.conf. See
drill-override-example.conf for more info. By default, drill-override.conf is
in $DRILL_HOME/conf, which would seem to imply that you must create a separate
copy of the Drill distro for each Drillbit on each node. You'd then start Drill
by pointing to the Drillbit-specific distro:
$DRILL_HOME1/bin/drillbit.sh start
and likewise with $DRILL_HOME2, $DRILL_HOME3, ... for Drillbits 2, 3, and so on.
An alternative is to use the site directory feature. You still need a separate
site directory per Drillbit, but they can share the Drill distro.
$DRILL_HOME/bin/drillbit.sh start --site $DRILL_SITE1
That is, one common $DRILL_HOME, but separate site directories ($DRILL_SITE1,
$DRILL_SITE2, $DRILL_SITE3...) for Drillbits 1, 2, 3...
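To make the per-site config concrete, here is a rough, untested sketch of a
helper that writes a minimal drill-override.conf into a site directory. The
function name write_site_conf, the spacing of 100 ports per Drillbit, and the
cluster-id and zk.connect values are placeholders of mine, not anything Drill
requires. The property names are the ones I believe appear in
drill-override-example.conf, so please check them against your version.

```shell
#!/bin/sh
# Hypothetical helper (not part of Drill): write a minimal drill-override.conf
# for Drillbit number $1 into site directory $2.
# Assumption: ports are spaced 100 apart per Drillbit, an arbitrary choice
# that leaves room in case Drill also uses ports adjacent to these.
write_site_conf() {
  n=$1
  site_dir=$2
  offset=$((n * 100))
  mkdir -p "$site_dir"
  # cluster-id and zk.connect are placeholder values; every Drillbit in the
  # same cluster should share whatever values your cluster actually uses.
  cat > "$site_dir/drill-override.conf" <<EOF
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181",
  rpc.user.server.port: $((31010 + offset)),
  rpc.bit.server.port: $((31011 + offset)),
  http.port: $((8047 + offset))
}
EOF
}
```

For example, "write_site_conf 1 /tmp/drill-site1" would produce a config with
user port 31110, which you could then pass via --site as shown above.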
Yet another approach is to pass the ports on the command line. The config
system is supposed to allow this. I've not personally tested this, so caveat
emptor:
$DRILL_HOME/bin/drillbit.sh start -Ddrill.exec.rpc.user.server.port=31110
You could wrap the above in a script so you can share both the Drill distro and
config across Drillbits.
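A minimal sketch of such a wrapper, with the same caveat that I have not
tested passing ports this way. The function name drillbit_start_cmd and the
100-port spacing are my own invention, and the bit and HTTP port property
names are what I believe the config system uses. The function only prints the
command so you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Drill): print the start command for
# Drillbit number $1, deriving its ports from the Drillbit number.
# Assumption: drillbit.sh passes -D options through to the JVM, as described
# above; this is untested.
drillbit_start_cmd() {
  n=$1
  offset=$((n * 100))
  echo "$DRILL_HOME/bin/drillbit.sh start" \
    "-Ddrill.exec.rpc.user.server.port=$((31010 + offset))" \
    "-Ddrill.exec.rpc.bit.server.port=$((31011 + offset))" \
    "-Ddrill.exec.http.port=$((8047 + offset))"
}
```

Once the printed command looks right, you could run it with something like
eval "$(drillbit_start_cmd 1)".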
Thanks,
- Paul
On Monday, August 27, 2018, 6:17:11 AM PDT, John Omernik <[email protected]>
wrote:
I will +1 Ted's idea. Running small Drillbits does add a bit of overhead, but
it also gives you the ability to scale your Drill cluster size
(especially using the Drillbit shutdown features added recently).
On Wed, Aug 22, 2018 at 8:23 PM, Ted Dunning <[email protected]> wrote:
> Cool
>
> On Wed, Aug 22, 2018, 17:07 scott <[email protected]> wrote:
>
> > Thanks Ted and Paul. I've been experimenting with the "hack" method. It
> > works somewhat, and I guess will have to do.
> >
> > On Tue, Aug 21, 2018 at 2:50 PM Ted Dunning <[email protected]> wrote:
> >
> > > A cheap hack is to use multiple smaller drillbits. Put more drillbits on
> > > the hefty machines and fewer on the weaker ones.
> > >
> > > This increases overheads, but it might help you out.
> > >
> > >
> > >
> > > On Tue, Aug 21, 2018 at 1:48 PM scott <[email protected]> wrote:
> > >
> > > > Hi community,
> > > > I am trying to find a way to tune Drill so that weaker drillbits get less
> > > > data to work on so that the weak link doesn't drag my performance down. I
> > > > have drillbits running on a variety of hardware and sometimes these shared
> > > > resources get really slow. It seems that the query plan always evenly
> > > > divides the data fragments so that each drillbit gets the same data to chew
> > > > on. How do I make it give weaker drillbits less data?
> > > >
> > > > Alternatively, is there a way to limit and queue fragments of the query and
> > > > leave them unassigned, then assign to drillbits as their resources become
> > > > free, similar to MapReduce?
> > > >
> > > > Thanks for your time,
> > > > Scott
> > > >
> > >
> >
>