[jira] [Updated] (YARN-3897) "Too many links" in NM log dir
[ https://issues.apache.org/jira/browse/YARN-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Zhiguo updated YARN-3897:
--
Description:
Users need to keep container logs for more than one day. On some nodes of our busy cluster, the number of subdirs of {yarn.nodemanager.log-dirs} may reach 32000, which is the default limit of the ext3 file system. As a result, we got errors when initiating containers:
"Failed to create directory {yarn.nodemanager.log-dirs}/application_1435111082717_1341740 - Too many links"

Log aggregation is not an option for us because of the heavy pressure on the namenode. With a cluster of 5K nodes and 20k log files per node, it's not acceptable to aggregate so many files to HDFS.

Since ext3 is still widely used, we'd better do something to avoid such errors.

was:
Users need to keep container logs for more than one day. On some nodes of our busy cluster, the number of subdirs of {yarn.nodemanager.log-dirs} may reach 32000, which is the default limit of the ext3 file system. As a result, we got errors when initiating containers:
"Failed to create directory {yarn.nodemanager.log-dirs}/logs/application_1435111082717_1341740 - Too many links"

Log aggregation is not an option for us because of the heavy pressure on the namenode. With a cluster of 5K nodes and 20k log files per node, it's not acceptable to aggregate so many files to HDFS.

Since ext3 is still widely used, we'd better do something to avoid such errors.

> "Too many links" in NM log dir
> --
>
> Key: YARN-3897
> URL: https://issues.apache.org/jira/browse/YARN-3897
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Hong Zhiguo
> Assignee: Hong Zhiguo
> Priority: Minor
>
> Users need to keep container logs for more than one day. On some nodes of our
> busy cluster, the number of subdirs of {yarn.nodemanager.log-dirs} may reach
> 32000, which is the default limit of the ext3 file system. As a result, we got
> errors when initiating containers:
> "Failed to create directory
> {yarn.nodemanager.log-dirs}/application_1435111082717_1341740 - Too many
> links"
> Log aggregation is not an option for us because of the heavy pressure on the
> namenode. With a cluster of 5K nodes and 20k log files per node, it's not
> acceptable to aggregate so many files to HDFS.
> Since ext3 is still widely used, we'd better do something to avoid such errors.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3897) "Too many links" in NM log dir
[ https://issues.apache.org/jira/browse/YARN-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Zhiguo updated YARN-3897:
--
Description:
Users need to keep container logs for more than one day. On some nodes of our busy cluster, the number of subdirs of {yarn.nodemanager.log-dirs} may reach 32000, which is the default limit of the ext3 file system. As a result, we got errors when initiating containers:
"Failed to create directory {yarn.nodemanager.log-dirs}/logs/application_1435111082717_1341740 - Too many links"

Log aggregation is not an option for us because of the heavy pressure on the namenode. With a cluster of 5K nodes and 20k log files per node, it's not acceptable to aggregate so many files to HDFS.

Since ext3 is still widely used, we'd better do something to avoid such errors.

was:
Users need to keep container logs for more than one day. On some nodes of our busy cluster, the number of subdirs of {yarn.nodemanager.log-dirs} may reach 32000, which is the default limit of the ext3 file system. As a result, we got errors when initiating containers:
"Failed to create directory {yarn.nodemanager.log-dirs}/logs/application_1435111082717_1341740 - Too many links"

Log aggregation is not an option for us because of the heavy pressure on the namenode. With a cluster of 5K nodes and 20k log files per node, it's not acceptable to aggregate some many files to HDFS.

Since ext3 is still widely used, we'd better do something to avoid such errors.

> "Too many links" in NM log dir
> --
>
> Key: YARN-3897
> URL: https://issues.apache.org/jira/browse/YARN-3897
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Hong Zhiguo
> Assignee: Hong Zhiguo
> Priority: Minor
>
> Users need to keep container logs for more than one day. On some nodes of our
> busy cluster, the number of subdirs of {yarn.nodemanager.log-dirs} may reach
> 32000, which is the default limit of the ext3 file system. As a result, we got
> errors when initiating containers:
> "Failed to create directory
> {yarn.nodemanager.log-dirs}/logs/application_1435111082717_1341740 - Too many
> links"
> Log aggregation is not an option for us because of the heavy pressure on the
> namenode. With a cluster of 5K nodes and 20k log files per node, it's not
> acceptable to aggregate so many files to HDFS.
> Since ext3 is still widely used, we'd better do something to avoid such errors.
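
For illustration only, here is a rough sketch of how an operator might check how close a log directory is to the subdirectory limit described in this issue. It is not part of YARN or of any proposed patch. The function names, the 90% warning threshold, and the constant below are hypothetical; the figure of 32000 is taken from the issue description, and the usable count on a given file system may differ slightly.

```python
import os
import tempfile

# Limit quoted in the issue description; an assumption, not read from the file system.
EXT3_SUBDIR_LIMIT = 32000


def count_subdirs(path):
    """Return the number of immediate subdirectories of path (files are ignored)."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def subdir_headroom(path, limit=EXT3_SUBDIR_LIMIT):
    """Return how many more subdirectories fit before limit is reached."""
    return max(limit - count_subdirs(path), 0)


def is_near_limit(path, limit=EXT3_SUBDIR_LIMIT, threshold=0.9):
    """Return True when the subdirectory count is at or above threshold * limit."""
    return count_subdirs(path) >= threshold * limit


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real NM log dir.
    with tempfile.TemporaryDirectory() as log_dir:
        for i in range(5):
            os.mkdir(os.path.join(log_dir, f"application_{i}"))
        print(count_subdirs(log_dir), is_near_limit(log_dir, limit=5))
```

A check like this could be pointed at each directory listed in yarn.nodemanager.log-dirs, for example from a cron job, to warn before container initialization starts failing with "Too many links".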