Paul,

Thanks for the reality side of this. Configuring a system to handle unusual setups can definitely be a challenge.

Btw, the general term for running several sub-scale workers on each node to
allow more flexibility is "micro-sharding".

On Mon, Aug 27, 2018 at 3:24 PM Paul Rogers <par0...@yahoo.com.invalid> wrote:

> Hi All,
>
> For those following along who have not tried Ted's idea (running multiple
> Drillbits per host), note that when running two or more Drillbits per node,
> the admin is responsible for choosing non-conflicting port numbers.
>
> The port numbers are configured in drill-override.conf. See
> drill-override-example.conf for more info. By default, drill-override.conf
> is in $DRILL_HOME/conf, which would seem to imply that you must create a
> separate copy of the Drill distro for each Drillbit on each node. You'd
> then start Drill by pointing to the Drillbit-specific distro:
>
>     $DRILL_HOME1/bin/drillbit.sh start
>
> For Drillbits 1, 2, 3...
>
> An alternative is to use the site directory feature. You still need a
> separate site directory per Drillbit, but they can share the Drill distro.
>
>     $DRILL_HOME/bin/drillbit.sh start --site $DRILL_SITE1
>
> For a common $DRILL_HOME but separate sites for 1, 2, 3...
>
> Yet another approach is to pass the ports on the command line. The config
> system is supposed to allow this. I've not personally tested this, so
> caveat emptor:
>
>     $DRILL_HOME/bin/drillbit.sh start -Ddrill.exec.rpc.user.server.port=31110
>
> You could wrap the above in a script so you can share both the Drill
> distro and config across Drillbits.
>
> Thanks,
> - Paul
>
> On Monday, August 27, 2018, 6:17:11 AM PDT, John Omernik <j...@omernik.com> wrote:
>
> I will +1 Ted's idea. By doing small drillbits, it does take a bit more
> overhead, but you also have an ability to scale your Drill cluster size
> (especially using the Drillbit shutdown features added recently).
>
> On Wed, Aug 22, 2018 at 8:23 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > Cool
> >
> > On Wed, Aug 22, 2018, 17:07 scott <tcots8...@gmail.com> wrote:
> >
> > > Thanks Ted and Paul. I've been experimenting with the "hack" method. It
> > > works somewhat, and I guess will have to do.
> > >
> > > On Tue, Aug 21, 2018 at 2:50 PM Ted Dunning <ted.dunn...@gmail.com> wrote:
> > >
> > > > A cheap hack is to use multiple smaller drillbits. Put more drillbits on
> > > > the hefty machines and fewer on the weaker ones.
> > > >
> > > > This increases overheads, but it might help you out.
> > > >
> > > > On Tue, Aug 21, 2018 at 1:48 PM scott <tcots8...@gmail.com> wrote:
> > > >
> > > > > Hi community,
> > > > > I am trying to find a way to tune Drill so that weaker drillbits get less
> > > > > data to work on so that the weak link doesn't drag my performance down. I
> > > > > have drillbits running on a variety of hardware and sometimes these shared
> > > > > resources get really slow. It seems that the query plan always evenly
> > > > > divides the data fragments so that each drillbit gets the same data to chew
> > > > > on. How do I make it give weaker drillbits less data?
> > > > >
> > > > > Alternatively, is there a way to limit and queue fragments of the query and
> > > > > leave them unassigned, then assign to drillbits as their resources become
> > > > > free, similar to MapReduce?
> > > > >
> > > > > Thanks for your time,
> > > > > Scott
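
Paul's suggestion in the thread above, wrapping the command-line port override in a script, might look roughly like the sketch below. It only prints the start commands rather than running them. Several things here are assumptions, not taken from the thread: the helper names `drillbit_port` and `drillbit_start_cmd`, the `/opt/drill` fallback path, the spacing of 100 between Drillbits (chosen to match the 31110 in Paul's example), and the idea of also overriding `drill.exec.rpc.bit.server.port` and `drill.exec.http.port` alongside the user port. As Paul notes, passing ports with `-D` is untested, so treat this as a starting point only.

```shell
#!/bin/sh
# Hypothetical wrapper: prints the start command for the Nth Drillbit on a host.
# Assumes drillbit.sh honors -D overrides as described in the thread (untested).

DRILL_HOME="${DRILL_HOME:-/opt/drill}"   # assumed install location
PORT_STEP=100                            # assumed gap between Drillbits

# Print the port for Drillbit number $2, given a base port $1.
drillbit_port() {
  echo $(( $1 + $2 * $PORT_STEP ))
}

# Print (do not run) the start command for Drillbit number $1.
drillbit_start_cmd() {
  idx="$1"
  echo "$DRILL_HOME/bin/drillbit.sh start" \
    "-Ddrill.exec.rpc.user.server.port=$(drillbit_port 31010 "$idx")" \
    "-Ddrill.exec.rpc.bit.server.port=$(drillbit_port 31011 "$idx")" \
    "-Ddrill.exec.http.port=$(drillbit_port 8047 "$idx")"
}

# Example: show the commands for three Drillbits on this host.
for i in 0 1 2; do
  drillbit_start_cmd "$i"
done
```

Index 0 keeps the base ports used in the sketch, and each further Drillbit shifts every port by the same step. The sketch covers ports only; anything else that must differ per Drillbit, such as log or PID file locations, is not addressed.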