> On Tez, this is run as a single DAG of M-R+ ... Can't tell which vertex is the slow one in this.

More tooling for isolating which vertex is taking up time (and which task):

https://github.com/apache/tez/tree/master/tez-tools/swimlanes

or alternatively run

https://github.com/t3rmin4t0r/tez-swimlanes/blob/master/vertex.py

The first one should get you a graph which looks a lot like

http://people.apache.org/~gopalv/query4.svg

and the 2nd one should get you something which looks like

http://people.apache.org/~gopalv/q21_suppliers_who_kept_orders_waiting.svg

(note the skewed tail in Reducer 3)

> It gets stuck due to some large apps in the 1st Reducer Phase while
> holding all subsequent 12 Reducer phases until the final Reducer in the
> 2nd phase is finished.

You're splitting the sort buffers 12-way.

> Are there things in Tez I can leverage or change my query to make it
> conducive for Tez to deal with skew better?

Usually Tez runs all containers using the Mapper Xmx values, if left unconfigured.

Most of the time a perf diff is reported, it's due to the use of 1.5Gb containers (and 6Gb reducers in MRv2).

Assuming that isn't the case, get the other SVGs produced - they should tell me exactly what's wrong.

Tez doesn't introduce skews in general, but the impact of dividing io.sort.mb into 12 chunks might be a problem.

Cheers,
Gopal

PS: in 0.8.2, the tooling actually gets you something like - https://issues.apache.org/jira/secure/attachment/12751186/criticalPath.jpg
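
PPS: to put a rough number on the 12-way split of io.sort.mb mentioned above, here is a back-of-the-envelope sketch. It assumes the sort buffer is divided evenly across the 12 downstream outputs, which is a simplification. The 512 MB value and the function name are made up for illustration, not taken from your job.

```python
def per_output_sort_buffer_mb(io_sort_mb: float, fan_out: int) -> float:
    """Share of one sort buffer each output gets, assuming an even split.

    Both the even split and the helper itself are illustrative assumptions;
    this is not a Tez API.
    """
    if fan_out <= 0:
        raise ValueError("fan_out must be positive")
    return io_sort_mb / fan_out


if __name__ == "__main__":
    io_sort_mb = 512   # hypothetical io.sort.mb setting, in MB
    fan_out = 12       # the 12 subsequent reducer phases from the thread
    share = per_output_sort_buffer_mb(io_sort_mb, fan_out)
    print(f"{io_sort_mb} MB split {fan_out}-way leaves about {share:.1f} MB per output")
```

A smaller share per output generally means more spills for the same amount of data, which is why the split is worth checking before blaming skew.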