drillbit colocation questions

Wesley Chow Mon, 06 Jun 2016 12:03:07 -0700

I have some general questions that I've been unable to google. I'm
particularly interested in co-locating drillbits with nodes in a custom
store of ours, so I've been poking around in source and searching about for
examples of this.


1. My understanding is that Drill understands HDFS and if you co-locate a
drillbit with a data node, then Drill will automatically distribute queries
to the drillbits on the nodes that contain the relevant files.

1a. Where does Drill run a join then? On the node that initiated the query,
or on one of the nodes that contain the data?

1b. Does Drill automatically look up which nodes hold the data in question,
or is this specified in the query somehow?

2. Does Drill also understand data distribution in HBase? Do queries get
sent to nodes that contain the HBase rows in question?

3. We have a custom data store that we'd like to make Drill-aware, and we want a
drillbit on the machine itself. Are there any examples of co-locating
drillbits with non-HDFS data sources?

4. If we place files on a bunch of different servers and install drillbits
on each one, and we determine which servers contain which files
out-of-band, is there a way to submit a query to Drill that tells it which
nodes contain local files to read?

Btw, I would be really interested in chatting/drinking with someone who
knows the Drill code well and is based in NYC.

Thanks,
Wes
