Thanks Aman. This answers my question. I suppose that as a workaround for the time being I could denormalize the smaller of the two large tables into the bigger one.
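A one-off denormalization could look roughly like this (a minimal sketch in HiveQL; the table and column names are made up for illustration, not from this thread):

    -- Fold the smaller large table (dim_big) into the bigger one (fact_big)
    -- so the expensive join is pre-computed once.
    CREATE TABLE fact_denorm AS
    SELECT f.*, d.attr1, d.attr2
    FROM fact_big f
    JOIN dim_big d ON f.join_key = d.join_key;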

I would also be interested in the group's opinions on data co-locality (as implemented by Teradata). This is not so much a Drill question as one about the distributed file system.

As far as I know, data co-locality of blocks cannot be implemented in HDFS. What is the situation in MapR-FS?

On 13/02/2015 16:45, Aman Sinha wrote:
Drill joins (either hash join or merge join) currently generate three types of plans: (1) hash-distribute both sides of the join, (2) hash-distribute the left side and broadcast the right side, (3) broadcast the right side and do not distribute the left side. These are cost-based decisions driven by cardinalities. However, the table's partitioning information is not currently used, because at least for now it is not exposed to Drill.
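You can see which of these the planner chose for a particular query with Drill's EXPLAIN syntax; a sketch (table names here are placeholders):

    EXPLAIN PLAN FOR
    SELECT *
    FROM hive.t1 a
    JOIN hive.t2 b ON a.key = b.key;

In the physical plan text, the distribution strategy shows up as exchange operators on the join inputs (typically hash exchanges on both sides, or a broadcast exchange on the right input).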

This does not mean it cannot be added in the future.  Currently, Drill does
not maintain a catalog or extensive metadata - it is designed for
schema-less data.  So in your use case it would end up re-distributing the
data.  But we have plans for exposing the distribution information to the
planner.

On Fri, Feb 13, 2015 at 8:37 AM, Uli Bethke <[email protected]> wrote:

Thanks Jason.

Just a bit more background on my question. Modern MPPs such as Teradata allow for full data co-locality via hash distribution on keys: rows from two large tables that share a join key always land on the same node, so the join incurs no network overhead, which makes large-table joins very fast. In Hive we have bucketing, which somewhat alleviates the problem. However, it does not guarantee data co-locality, as two corresponding bucket files may end up on two different nodes. While bucketing reduces network overhead, it does not eliminate it.
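For reference, the bucketing I mean is Hive's CLUSTERED BY; a sketch of two co-bucketed tables (names are illustrative):

    -- Both tables bucketed and sorted on the join key into the same number
    -- of buckets, which enables bucket map joins / sort-merge-bucket joins.
    CREATE TABLE orders (order_id BIGINT, cust_id BIGINT, amount DOUBLE)
    CLUSTERED BY (cust_id) SORTED BY (cust_id) INTO 32 BUCKETS;

    CREATE TABLE customers (cust_id BIGINT, name STRING)
    CLUSTERED BY (cust_id) SORTED BY (cust_id) INTO 32 BUCKETS;

Even then, HDFS may place bucket file N of orders and bucket file N of customers on different nodes, which is exactly the gap I am describing.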



On 13/02/2015 16:25, Jason Altekruse wrote:

I don't think this actually answers your question. You can filter by directory to avoid unnecessary reads from the filesystem, and some of the storage plugins such as HBase and Hive implement scan-level pushdown, but I do not know whether this is sophisticated enough for a join to be aware of the partitioning. I'll keep watching the thread and reach out to our planning experts if they don't chime in.

- Jason

On Fri, Feb 13, 2015 at 5:45 AM, Carol McDonald <[email protected]> wrote:

Yes, you can read about it here:
https://cwiki.apache.org/confluence/display/DRILL/Partition+Pruning
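For a concrete flavor of what that page covers (the path here is hypothetical): if the files are laid out in year/month subdirectories, filtering on Drill's implicit dir0/dir1 columns prunes the directories that are scanned:

    SELECT *
    FROM dfs.`/data/events`                -- hypothetical path
    WHERE dir0 = '2015' AND dir1 = '02';   -- dir0/dir1 = first/second directory level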

On Fri, Feb 13, 2015 at 6:42 AM, Uli Bethke <[email protected]> wrote:

I have two large tables in Hive (both partitioned and bucketed). Using map-side joins I can take advantage of data locality for the hash join table.

Using Drill, does the optimizer take the partitioning and bucketing into consideration?
thanks
uli



--
___________________________
Uli Bethke
Co-founder Sonra
p: +353 86 32 83 040
w: www.sonra.io
l: linkedin.com/in/ulibethke
t: twitter.com/ubethke


