LoggingResearch created MAPREDUCE-7486:
------------------------------------------

             Summary: Handling Cluster Storage Capacity Exceeded Exception with 
Enhanced Logging
                 Key: MAPREDUCE-7486
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7486
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mapreduce-client
    Affects Versions: 3.3.6
         Environment: Version: {{3.3.6}}
Location: 
{{hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java}},
 in the {{reportError}} method, lines 241-250.
            Reporter: LoggingResearch
         Attachments: TestYarnChild.java, original-vs-log-enhanced.md

The existing {{reportError}} method in {{YarnChild.java}} handles exceptions 
raised during job execution. When the exception is caused by the cluster 
storage capacity being exceeded, the method does not log enough information, 
particularly when the job is not configured to fail fast. As a result, users 
cannot easily tell why a job did not fail immediately after the storage 
capacity was exceeded. This enhancement adds detailed logging that tells users 
which configuration is preventing fast failure.
 
*Expected Behavior:* 
When a {{ClusterStorageCapacityExceededException}} is encountered, the system 
should log whether the job is configured to fail fast. If fast fail is 
disabled, the log should advise users on how to enable it.
 
*How-to-Fix:*
We propose to *expose the relationship between the fast-fail configuration and 
the job's behavior by logging it* in {{reportError}}.
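
The proposed behavior can be sketched roughly as follows. This is a minimal, 
standalone illustration, not the attached patch and not the actual 
{{YarnChild}} code: the exception class is a local stand-in for Hadoop's 
{{ClusterStorageCapacityExceededException}}, the configuration is a plain map 
rather than a {{JobConf}}, and the property name 
{{mapreduce.job.dfs.storage.capacity.kill-limit-exceed}} is an assumption that 
should be checked against {{MRJobConfig}} in the Hadoop version in use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

public class FastFailLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(FastFailLoggingSketch.class.getName());

    // Assumed property name; verify against MRJobConfig before relying on it.
    public static final String KILL_LIMIT_EXCEED_KEY =
        "mapreduce.job.dfs.storage.capacity.kill-limit-exceed";

    // Local stand-in for Hadoop's ClusterStorageCapacityExceededException.
    public static class ClusterStorageCapacityExceededException
            extends java.io.IOException {
        public ClusterStorageCapacityExceededException(String message) {
            super(message);
        }
    }

    // Walks the cause chain looking for the storage-capacity exception.
    public static boolean hasStorageCapacityCause(Throwable error) {
        for (Throwable cur = error; cur != null; cur = cur.getCause()) {
            if (cur instanceof ClusterStorageCapacityExceededException) {
                return true;
            }
        }
        return false;
    }

    // Returns true if the job should fail fast, and logs the decision
    // either way so the user can see which setting drove it.
    public static boolean reportError(Exception error,
                                      Map<String, String> conf) {
        if (!hasStorageCapacityCause(error)) {
            return false;
        }
        boolean failFast = Boolean.parseBoolean(
            conf.getOrDefault(KILL_LIMIT_EXCEED_KEY, "false"));
        if (failFast) {
            LOG.severe("Ignoring retries: job is configured to fail fast on "
                + "ClusterStorageCapacityExceededException; exception is "
                + error);
        } else {
            LOG.warning("ClusterStorageCapacityExceededException encountered, "
                + "but the job is not configured to fail fast, so the task "
                + "may be retried. To fail fast, set "
                + KILL_LIMIT_EXCEED_KEY + "=true. Exception is " + error);
        }
        return failFast;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        Exception error = new RuntimeException(
            new ClusterStorageCapacityExceededException("quota exceeded"));

        // Fast fail disabled (default): a warning with advice is logged.
        System.out.println("fastFail=" + reportError(error, conf));

        // Fast fail enabled: the existing-style error message is logged.
        conf.put(KILL_LIMIT_EXCEED_KEY, "true");
        System.out.println("fastFail=" + reportError(error, conf));
    }
}
```

The point of the extra {{else}} branch is that the non-fast-fail path, which 
is otherwise silent, now names the setting the user would need to change.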



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org