Wang Yan created TEZ-4110:
-----------------------------
Summary: Make Tez fail fast when DFS quota is exceeded
Key: TEZ-4110
URL: https://issues.apache.org/jira/browse/TEZ-4110
Project: Apache Tez
Issue Type: Improvement
Environment: hadoop 2.9, hive 2.3, tez
Reporter: Wang Yan
This ticket aims to bring a feature similar to MAPREDUCE-7148 to Tez:
make a Tez job fail fast when the DFS quota is reached.
Background: we run Hive jobs with a per-job DFS quota (3 TB). If a job hits
the DFS quota, the task that hit it fails, and there are a few task retries
before the job actually fails. The retries are not helpful because the job
will fail anyway. In one of the worse cases, a job had a single reduce task
that wrote more than 3 TB to HDFS over 20 hours; the reduce task exceeded the
quota and was retried 4 times before the job finally failed, consuming a lot
of resources unnecessarily.
This ticket proposes a feature that lets a job fail fast when it writes too
much data to the DFS and exceeds the DFS quota.
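
To illustrate the requested behaviour, here is a minimal sketch of the
decision only, not Tez code. All names in it (QuotaFailFast,
QuotaExceededStandIn, isQuotaFailure, attemptsAllowed, the failFastEnabled
flag) are hypothetical. A real implementation would presumably check for
Hadoop's quota exception types (for example DSQuotaExceededException)
instead of the stand-in class, and would read the switch from a Tez
configuration property that does not exist yet.

```java
import java.io.IOException;

class QuotaFailFast {
    /** Hypothetical stand-in for a DFS quota exception; NOT a Hadoop class. */
    static class QuotaExceededStandIn extends IOException {
        QuotaExceededStandIn(String msg) { super(msg); }
    }

    /** Returns true if the failure, or any exception in its cause chain, is a quota violation. */
    static boolean isQuotaFailure(Throwable t) {
        int depth = 0;
        while (t != null && depth < 32) { // depth cap guards against cyclic cause chains
            if (t instanceof QuotaExceededStandIn) {
                return true;
            }
            t = t.getCause();
            depth++;
        }
        return false;
    }

    /**
     * Number of attempts a task gets for this kind of failure.
     * With fail-fast enabled, a quota failure is treated as fatal (one attempt, no retries);
     * every other failure keeps the normal attempt limit.
     */
    static int attemptsAllowed(Throwable failure, int maxAttempts, boolean failFastEnabled) {
        return (failFastEnabled && isQuotaFailure(failure)) ? 1 : maxAttempts;
    }

    public static void main(String[] args) {
        Throwable quota = new RuntimeException("write failed",
                new QuotaExceededStandIn("DFS quota exceeded"));
        Throwable other = new IOException("connection reset");

        System.out.println(attemptsAllowed(quota, 4, true));  // prints 1
        System.out.println(attemptsAllowed(other, 4, true));  // prints 4
        System.out.println(attemptsAllowed(quota, 4, false)); // prints 4
    }
}
```

With fail-fast disabled the sketch keeps today's behaviour, so the 20-hour
reduce task described above would still be re-run; with it enabled the
first quota failure ends the job.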
--
This message was sent by Atlassian Jira
(v8.3.4#803005)