wasn't being limited by memory, but I tried to get the memory usage of
each tez task down so it could spawn more tasks (but it didn't). Giving
tez more or less memory didn't really improve the performance.
How would one go about finding out the limiting factor on the performance
of a job? would job
I have a map join in which the smaller tables together are 200 MB, and
I am trying to have one block of the main table be processed by one tez task.
...
What am I missing, and is this even the right way of approaching the
problem?
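
For what it's worth, the sizing idea in the question (the small tables plus one block of the main table per task) can be written out as plain arithmetic. This is only a rough sketch under stated assumptions: the 200 MB figure comes from the question, while the 128 MB block size, the 100 MB sort buffer, the 80% usable-heap fraction, and the function name itself are hypothetical values chosen for illustration, not Hive or Tez defaults.

```python
def estimate_container_mb(small_tables_mb, block_mb, sort_mb, usable_pct=80):
    """Estimate a container size in MB for one map-join task.

    The working set is the in-memory small tables, one block of the main
    table, and the sort buffer. usable_pct is the assumed share of the
    container that the task can actually use as heap.
    """
    working_set = small_tables_mb + block_mb + sort_mb
    # Integer ceiling division, so the result never undershoots the working set.
    return (working_set * 100 + usable_pct - 1) // usable_pct


# 200 MB of small tables (from the question); block and sort sizes are assumed.
print(estimate_container_mb(200, 128, 100))  # 535
```

A number like this is only a starting point. Whether it is enough in practice depends on how much the small tables expand once loaded into memory, which varies by Hive version.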
You need to be more specific about the Hive version. Hive-13 needs ~6x
My tez query seems to error out.
I have a map join in which the smaller tables together are 200 MB, and
I am trying to have one block of the main table be processed by one tez task.
I am using the following formula to calculate the tez container size:
Small table size + each block size + memory for sort +