Arpit Agarwal created HADOOP-15450:
--------------------------------------

Summary: Avoid fsync storm triggered by DiskChecker and handle disk full situation
Key: HADOOP-15450
URL: https://issues.apache.org/jira/browse/HADOOP-15450
Project: Hadoop Common
Issue Type: Bug
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal

Fix disk checker issues reported by [~kihwal] in HADOOP-13738:

1. When space is low, the OS returns ENOSPC. Instead of simply stopping writes to that drive, the drive is marked bad and replication happens. This makes a cluster-wide space problem worse. If the number of "failed" drives exceeds the DFIP limit, the datanode shuts down.
2. There are non-HDFS users of DiskChecker, who use it proactively, not just on failures. This was fine before, but now it incurs heavy I/O due to the introduction of fsync() in the code.
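To make the two problems above concrete, here is a minimal sketch of a disk check that separates "disk full" from "disk failed" and makes the fsync optional. It is not the actual HADOOP-15450 change: the class DiskCheckSketch, the Result enum, the checkDir method and the MIN_FREE_BYTES threshold are all hypothetical names chosen for illustration. Java does not expose errno directly, so the sketch assumes that a failed write on a directory with almost no usable space was caused by ENOSPC; that is a heuristic, not a guarantee.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

/** Hypothetical sketch only; not the real DiskChecker and not the HADOOP-15450 patch. */
final class DiskCheckSketch {

  /** Outcome of a check: FULL is reported separately so callers need not mark the drive bad. */
  enum Result { HEALTHY, FULL, FAILED }

  /** Assumed threshold: below this much usable space, a write error is treated as "disk full". */
  static final long MIN_FREE_BYTES = 4096;

  /**
   * Writes a one-byte probe file into dir and removes it again.
   * Proactive callers can pass doFsync=false to avoid forcing data to disk on every check.
   */
  static Result checkDir(File dir, boolean doFsync) {
    if (!dir.isDirectory()) {
      return Result.FAILED; // a missing or non-directory path is a real failure
    }
    File probe = null;
    try {
      probe = File.createTempFile("disk-check-", ".tmp", dir);
      try (FileOutputStream out = new FileOutputStream(probe)) {
        out.write(1);
        if (doFsync) {
          out.getFD().sync(); // the expensive step; skipped for lightweight checks
        }
      }
      return Result.HEALTHY;
    } catch (IOException e) {
      // Heuristic stand-in for detecting ENOSPC: very little usable space left.
      boolean looksFull = dir.isDirectory() && dir.getUsableSpace() < MIN_FREE_BYTES;
      return looksFull ? Result.FULL : Result.FAILED;
    } finally {
      if (probe != null) {
        probe.delete(); // best-effort cleanup of the probe file
      }
    }
  }
}
```

Under these assumptions, a caller that sees FULL could stop writing to that volume without counting it as a failed drive, and a caller that checks proactively could pass doFsync=false and reserve the fsync for checks that follow an actual I/O error.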