On the right distribution, you can restrict which subset of the cluster
holds the data you need, which avoids locality variation when Drill runs on
only a subset of the nodes.
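
To put rough numbers on the locality effect discussed in this thread, here is a
minimal sketch. It assumes HDFS-style replication with each block's replicas
placed on distinct nodes chosen uniformly at random, which is a simplification
(real placement is rack-aware). The function name and parameters are
hypothetical and are not part of Drill. It only estimates the fraction of
blocks that have at least one replica on a node running a drillbit; it says
nothing about how Drill actually schedules fragments.

```python
from math import comb


def local_read_fraction(nodes: int, drillbits: int, replication: int = 3) -> float:
    """Estimate the fraction of blocks with a replica on some drillbit node.

    Assumption (not from Drill): replicas sit on `replication` distinct nodes
    picked uniformly at random out of `nodes`.
    """
    if not 0 <= drillbits <= nodes:
        raise ValueError("drillbits must be between 0 and nodes")
    if not 1 <= replication <= nodes:
        raise ValueError("replication must be between 1 and nodes")
    # Probability that every replica lands on a node without a drillbit.
    miss = comb(nodes - drillbits, replication) / comb(nodes, replication)
    return 1.0 - miss


if __name__ == "__main__":
    # Compare a few deployment sizes on a hypothetical 10-node cluster.
    for d in (2, 5, 10):
        print(d, round(local_read_fraction(10, d), 3))
```

Under these assumptions, one drillbit on every node gives a fraction of 1.0,
and the fraction drops as drillbits are removed, which is consistent with the
observation quoted below.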



On Thu, Jul 14, 2016 at 6:48 AM, François Méthot <[email protected]>
wrote:

> We have observed that if the number of drillbits is lower than the number
> of nodes in our cluster, some minor fragments take longer to complete
> (we hypothesize that this is because they can't take advantage of data
> locality: a fragment has to reach out for data on a different node). One
> drillbit per node, with evenly spread data, is the best scenario.
>
> I think these results may also vary depending on your hardware.
>
> On Thu, Jul 7, 2016 at 7:06 PM, Ashish Goel <[email protected]>
> wrote:
>
> > That's an interesting question. I would also be curious to learn more
> > about this. Did anyone run any benchmarks around this? It would be
> > helpful to understand.
> >
> > On Thu, Jul 7, 2016 at 11:13 AM, scott <[email protected]> wrote:
> >
> > > Abdel,
> > > I didn't ask about having more than one drillbit per node. I asked
> > > about the number of drillbits per cluster. For instance, if I had a
> > > 1000-node Hadoop cluster, should I install drillbits on each node? Or
> > > is there some point at which the interaction of 1000 drillbits causes
> > > contention, resulting in a plateau or decline in performance?
> > >
> > > Thanks,
> > > Scott
> > >
> > > On Thu, Jul 7, 2016 at 5:00 PM, Abdel Hakim Deneche <[email protected]>
> > > wrote:
> > >
> > > > I'm not sure you'll get any performance improvement from running more
> > > > than a single drillbit per cluster node.
> > > >
> > > > On Thu, Jul 7, 2016 at 9:47 AM, scott <[email protected]> wrote:
> > > >
> > > > > Follow-up question: Is there a sweet spot for the
> > > > > DRILL_MAX_DIRECT_MEMORY and DRILL_HEAP settings?
> > > > >
> > > > > On Wed, Jul 6, 2016 at 2:42 PM, scott <[email protected]> wrote:
> > > > >
> > > > > > Hello,
> > > > > > Does anyone know if there is a recommended maximum number of
> > > > > > drillbits in a Drill cluster? For example, I've observed that in a
> > > > > > Solr Cloud, ingest performance tapers off at around 16 JVM
> > > > > > instances. Is there a similar practical limit to the number of
> > > > > > drillbits I should cluster together?
> > > > > >
> > > > > > Thanks,
> > > > > > Scott
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> >
> > --
> > Thanks,
> > Ashish
> >
>
