Hi Gopal,

Thanks a lot for the answers. They were helpful.
I have a few more questions regarding this:

1. OOM condition -- I get the following error when I force a map join in
hive/tez with low container size and heap size:"
java.lang.OutOfMemoryError: Java heap space". I was wondering what is the
condition which leads to this error. Is it when, even one of the 16
partitions of hashtable, cannot fit in memory? I tried setting
hive.mapjoin.hybridgrace.minnumpartitions to higher values like 50 and 100
(expecting the size of each partition to drop drastically  [the join key
doesn't have skew in the distribution] ). But I still get the OOM error.
What is the cause?

2.  Shuffle Hash Join -- I am using hive 2.0.1. What is the way to force
this join implementation? Is there any documentation regarding the same?

3. Hash table size: I use "--hiveconf hive.root.logger=INFO,console" for
seeing logs. I do not see the hash table size in these logs. I tried using:
 "--hiveconf hive.tez.exec.print.summary=true". The output of this was the
following:

METHOD                         DURATION(ms)
parse                                  977
semanticAnalyze                      2,435
TezBuildDag                            473
TezSubmitToRunningDag                  841
TotalPrepTime                       10,451

VERTICES         TOTAL_TASKS  FAILED_ATTEMPTS KILLED_TASKS DURATION_SECONDS
   CPU_TIME_MILLIS     GC_TIME_MILLIS  INPUT_RECORDS   OUTPUT_RECORDS
Map 1                      3                0            0            37.11
            65,710              1,039     15,000,000       15,000,000
Map 2                     65                0            0           498.77
         4,163,920            100,613    615,037,902                0
This doesn't seem to have information about the hash table or #items
shuffled. Am I missing something ?


Thanks,
Ross

On Mon, Jun 27, 2016 at 9:10 AM, Gopal Vijayaraghavan <gop...@apache.org>
wrote:

> > 1. Is there a way to check the size of the hash table created during map
> >side join in Hive/Tez?
>
> Only from the log files.
>
> However, you enable hive.tez.exec.print.summary=true; then the hive CLI
> will print out the total # of items shuffle from the broadcast edges
> feeding the hashtable.
>
> Not sure if you might have the reduce-side map-join in your builds, but
> that is a bit harder to look into.
>
> > 2. Is the hash table (small table's), created for the entire table or
> >only for the selected and join key columns?
>
> Yup. Project, filter and group-by (in case of semi-joins).
>
> select a from tab1 where a IN (select b from tab2 ...);
>
> will inject a "select distinct b" into the plan.
>
> > 3. The hash table (created in map side join) spills to disk, if it does
> >not fit in memory Is there a parameter in hive/tez to specify the
> >percentage of the hash file which can spill?
>
> Not directly.
>
> hive.mapjoin.hybridgrace.minnumpartitions=16
>
>
> by default. So 1/16th of your key space will spill, whenever it hits the
> spilling conditions - for the small table.
>
> In general, the Snowflake-model dimension tables are joined by their
> primary key, so the key-space corresponds to the row-distribution too.
>
> For the big table it will spill only a smaller fraction since the
> BloomFilter built during hashtable generation is not spilled, so anything
> which misses the bloom filter will not spill to disk during the join.
>
> All that said, the spilling hash-join is much slower than the shuffling
> hash-join (new in 2.0), because the grace hash-join drops parts of the
> hash out of memory after each iteration & has to rebuild it for each split
> processed.
>
> In terms of total CPU, building a 4 million row hash table in 600 tasks is
> slower than building a 7500 row hashtable x 600 times - the hashtable
> lookup goes up by LG(N) too.
>
> Ask me more questions, if you need more info.
>
> Cheer,
> Gopal
>
>
>

Reply via email to