On the right distribution, you can restrict which subset of the cluster holds the data you need. That avoids the locality variation you otherwise get when Drill runs on only a subset of nodes.

On Thu, Jul 14, 2016 at 6:48 AM, François Méthot <[email protected]> wrote:

> We have observed that if the number of drillbits is lower than the number
> of nodes in our cluster, some minor fragment takes longer to complete their
> query (We hypothesize that it is because they can't take advantage of data
> locality, fragment has to reach out for data on a different node). One
> drillbit to one node, with evenly spread data is the best scenario.
>
> These results may also vary depending on your hardware I think.
>
> On Thu, Jul 7, 2016 at 7:06 PM, Ashish Goel <[email protected]> wrote:
>
> > That's an interesting question. I would also be curious to learn more about
> > this. Did anyone run any benchmarks around this? It would be helpful to
> > understand.
> >
> > On Thu, Jul 7, 2016 at 11:13 AM, scott <[email protected]> wrote:
> >
> > > Abdel,
> > > I didn't ask about having more than one drillbit per node. I asked about
> > > the number of drillbits per cluster. For instance, if I had a 1000 node
> > > Hadoop cluster, should I install drillbits on each node? Or, is there some
> > > point at which the interaction of 1000 drillbits causes contention
> > > resulting in a plateau or decline of performance?
> > >
> > > Thanks,
> > > Scott
> > >
> > > On Thu, Jul 7, 2016 at 5:00 PM, Abdel Hakim Deneche <[email protected]> wrote:
> > >
> > > > I'm not sure you'll get any performance improvement from running more than
> > > > a single drillbit per cluster node.
> > > >
> > > > On Thu, Jul 7, 2016 at 9:47 AM, scott <[email protected]> wrote:
> > > >
> > > > > Follow up question: Is there a sweet spot for DRILL_MAX_DIRECT_MEMORY and
> > > > > DRILL_HEAP settings?
> > > > >
> > > > > On Wed, Jul 6, 2016 at 2:42 PM, scott <[email protected]> wrote:
> > > > >
> > > > > > Hello,
> > > > > > Does anyone know if there is a maximum number of drillbits recommended in
> > > > > > a Drill cluster? For example, I've observed that in a Solr Cloud, the
> > > > > > performance tapers off for ingest at around 16 JVM instances. Is there a
> > > > > > similar practical limitation to the number of drillbits I should cluster
> > > > > > together?
> > > > > >
> > > > > > Thanks,
> > > > > > Scott
> > > >
> > > > --
> > > > Abdelhakim Deneche
> > > > Software Engineer
> > > > <http://www.mapr.com/>
> >
> > --
> > Thanks,
> > Ashish
