Drill for Data Virtualization

Sarnath K Wed, 10 Apr 2019 04:19:30 -0700

Hi,

I posted this in the user group and did not get any response. Since this
use case may be rare, I wanted to check with you all. Apologies for posting
to both lists.


I have a requirement to split data between a fast RDBMS (A) that will hold
hot data and a slower cold store (B).

Both A and B provide JDBC drivers.

I would like to know whether Drill can help me come up with a single JDBC
URL (C) that hides the fact that the data is split between A and B. In
other words, can Drill be used to implement data virtualization?

From what I have read about Drill, I can create two tables in Drill, one
pointing to A and the other to B, then write a UNION query over them and
expose it as a view.
However, when I run GROUP BY or filter queries against such a view, does
Drill take advantage of the underlying JDBC systems by sending one part of
the GROUP BY to A and another to B, and then fully reducing the partially
reduced results from A and B? That is, does it do some kind of smart
push-down for analytical queries? Or will Drill read the full data of the
union and then perform the grouping itself?
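
To make the question concrete, here is a minimal Python sketch of the two
behaviours I am asking about. It is not Drill code and says nothing about
what Drill actually does: two in-memory SQLite databases stand in for A and
B, and the `sales` table, its columns, and the function names are all
hypothetical.

```python
import sqlite3
from collections import defaultdict


def make_store(rows):
    """Create an in-memory SQLite database standing in for one JDBC source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


# Hypothetical data: "hot" plays the role of A, "cold" the role of B.
hot = make_store([("east", 10), ("west", 5), ("east", 7)])
cold = make_store([("east", 3), ("west", 20), ("north", 1)])

# The partial aggregate each source would run if the GROUP BY were pushed down.
PARTIAL_SQL = "SELECT region, SUM(amount), COUNT(*) FROM sales GROUP BY region"


def pushed_down(sources):
    """Each source groups its own rows; only the partial results are merged."""
    totals = defaultdict(lambda: [0, 0])
    for conn in sources:
        for region, partial_sum, partial_count in conn.execute(PARTIAL_SQL):
            totals[region][0] += partial_sum
            totals[region][1] += partial_count
    return {region: tuple(v) for region, v in totals.items()}


def full_read(sources):
    """Every raw row is fetched from both sources and grouped centrally."""
    totals = defaultdict(lambda: [0, 0])
    for conn in sources:
        for region, amount in conn.execute("SELECT region, amount FROM sales"):
            totals[region][0] += amount
            totals[region][1] += 1
    return {region: tuple(v) for region, v in totals.items()}


# Both strategies give the same (sum, count) per region; they differ only in
# how many rows have to leave A and B.
assert pushed_down([hot, cold]) == full_read([hot, cold])
```

In the first function only one row per group leaves each source, while in
the second every row does. Which of these two Drill does for a view over a
UNION of two JDBC tables is what I would like to understand.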

I hope the question is clear. I would much appreciate your response.

Thank you,

Best,
Sarnath
