Embedding Drill as a distributed query engine

Benjamin Schaff Tue, 21 Jan 2020 15:00:24 -0800

Hi everyone,

I would like to ask for recommendations or help with integrating Apache
Drill as a distributed SQL engine in a custom database. I may be going
about it the wrong way, so any feedback is appreciated.


What I would like to achieve is to embed a drillbit into each of my
database nodes. It is a distributed database written mostly in Scala, so it
runs inside the JVM. As you would expect, each storage node holds a
partition of the data, and I would like each SubScan to be routed to the
drillbit instance embedded within the database node that holds that
partition.

At this point, the drillbits are running and communicating properly with
ZooKeeper (I am also using ZooKeeper for the database). I can connect to
the plugin I created using sqlline, and I can list schemas and tables. So
basically, the metadata part is done and working.

I managed to build the partition work and affinity using the database's
distributed metadata, but I am stuck in the following situation.
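
To make the affinity step concrete, here is a minimal Java sketch of one way
to turn partition metadata into per-host affinity weights. It is only an
illustration under assumptions, not the actual plugin code: the class
AffinitySketch, the Partition type, and its fields are hypothetical, and
plain Java collections stand in for Drill's own endpoint and affinity
classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AffinitySketch {

    /** Hypothetical partition descriptor: id, owning host, and size in bytes. */
    public static final class Partition {
        final int id;
        final String host;
        final long sizeBytes;

        public Partition(int id, String host, long sizeBytes) {
            this.id = id;
            this.host = host;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Returns host -> fraction of the total bytes stored on that host.
     * A host that stores no partition does not appear in the result.
     */
    public static Map<String, Double> affinityByHost(List<Partition> partitions) {
        long total = 0;
        for (Partition p : partitions) {
            total += p.sizeBytes;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        if (total == 0) {
            return result; // no data, so no host is preferred
        }
        for (Partition p : partitions) {
            // Accumulate this partition's share onto its owning host.
            result.merge(p.host, (double) p.sizeBytes / total, Double::sum);
        }
        return result;
    }
}
```

In a real plugin, weights like these would presumably be wrapped in whatever
affinity objects the GroupScan is expected to report, keyed by the drillbit
endpoint running on each host.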

If I override the "DistributionAffinity getDistributionAffinity()" method
to return "HARD", then I end up with the following error:
"IllegalArgumentException: Sender fragment endpoint list should not be
empty", and the "applyAssignments" method of the GroupScan receives an
empty list of endpoints.
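
For reference, the assignment I am aiming for can be sketched as follows.
This is a simplified, self-contained Java illustration, not the real
GroupScan code: LocalAssignmentSketch and its assign method are hypothetical
names, endpoints are represented by host name only, and the position of an
endpoint in the list is assumed to be its minor fragment id. It also shows
why an empty endpoint list leaves nothing to assign to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalAssignmentSketch {

    /**
     * Maps each minor fragment id (index into endpointHosts) to the ids of
     * the partitions stored on that fragment's host.
     *
     * @param endpointHosts  host of each assigned endpoint, in fragment order
     * @param partitionHosts partition id -> host that stores the partition
     */
    public static Map<Integer, List<Integer>> assign(
            List<String> endpointHosts, Map<Integer, String> partitionHosts) {
        if (endpointHosts.isEmpty()) {
            // Mirrors the situation described above: nothing to assign to.
            throw new IllegalArgumentException("endpoint list is empty");
        }
        // If a host appears more than once, the first fragment on it wins;
        // later fragments on the same host get an empty partition list.
        Map<String, Integer> fragmentByHost = new LinkedHashMap<>();
        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        for (int i = 0; i < endpointHosts.size(); i++) {
            fragmentByHost.putIfAbsent(endpointHosts.get(i), i);
            result.put(i, new ArrayList<>());
        }
        for (Map.Entry<Integer, String> e : partitionHosts.entrySet()) {
            Integer fragment = fragmentByHost.get(e.getValue());
            if (fragment == null) {
                // Strict locality: refuse to read a partition remotely.
                throw new IllegalStateException(
                    "no endpoint on host " + e.getValue()
                        + " for partition " + e.getKey());
            }
            result.get(fragment).add(e.getKey());
        }
        return result;
    }
}
```

With endpoints on hosts "a" and "b" and partitions 0 and 2 stored on "a" and
partition 1 on "b", fragment 0 would read partitions 0 and 2 and fragment 1
would read partition 1.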

If I don't override it, then nodes without "local access" to the data still
get some work scheduled.

I was wondering if there is a way to prevent certain drillbits from
becoming the foreman.

Thanks in advance for any guidance.

