[jira] [Updated] (YARN-11188) Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
[ https://issues.apache.org/jira/browse/YARN-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11188:
--
    Target Version/s: 3.4.0
    Affects Version/s: 3.4.0

> Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
> ---
>
>                 Key: YARN-11188
>                 URL: https://issues.apache.org/jira/browse/YARN-11188
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>    Affects Versions: 3.4.0
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Log aggregation can be configured with a comma-separated list of file controllers.
> The current behaviour only removes files that belong to the first file controller.
> This can be problematic.
> For example, if a user configures IFile as the file controller and later changes the configuration to specify multiple file controllers (e.g. value = TFile,IFile), then only the first controller is considered and only the files belonging to that controller are removed. In this case, files written by the TFile controller are removed and files created with the IFile controller are kept.
> This behaviour should be changed so that the files of all configured file controllers are removed if multiple file controllers are enabled.
> h2. CODE PATH
>
> 1. [AggregatedLogDeletionService$LogDeletionTask#run|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L82-L108]:
>
> Let's look at what this method does.
> 1.1 An important bit is to see how the value of the field called 'retentionMillis' is set. In the constructor of LogDeletionTask, there's an incoming parameter called 'retentionSecs' that is simply multiplied by 1000 to get a millisecond value.
> Let's see where 'retentionSecs' is coming from.
> 1.2 It is [AggregatedLogDeletionService#scheduleLogDeletionTask|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L258-L283] that sets the value of retentionSecs.
> The config key for this value is 'yarn.log-aggregation.retain-seconds'.
> The javadoc says: "How long to wait before deleting aggregated logs, -1 disables. Be careful set this too small and you will spam the name node."
> 1.3 Going back to [https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L82-L108], the 'cutOffMillis' value is computed as the current time in millis minus retentionMillis.
> 1.4 The main point of this method is to iterate over the files in the remote root log dir (field called 'remoteRootLogDir') and to check whether each one is a directory. If so, a new Path is created with that particular directory ([code link|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L90-L96]).
> One more important thing to mention: there's a field called 'suffix' that is added to the remote root log dir path.
> Let's check how the 'remoteRootLogDir' and 'suffix' fields get their values, as this is crucial to understanding how the log dirs are deleted.
> 1.5 remoteRootLogDir is set in the constructor of LogDeletionTask, [here|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L77].
> The value is returned by calling fileController.getRemoteRootLogDir().
> The LogAggregationFileControllerFactory creates the instance of LogAggregationFileController.
>
> *The process of determining the log aggregation file controller is quite messy, so let me describe it in detail.*
> *There are 2 types of file controllers: LogAggregationIndexedFileController and LogAggregationTFileController.*
> *There's a testcase called [TestLogAggregationFileControllerFactory#testLogAggregationFileControllerFactory|#testLogAggregationFileControllerFactory] that shows how the LogAggregationFileControllerFactory is configured.*
> 2.1 First, some important configs:
> 2.1.1
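The behaviour described in the issue can be illustrated with a small, self-contained Java sketch. This is not the Hadoop code: the class LogDeletionSketch, its method names, and the directory paths are hypothetical stand-ins chosen for illustration, and the per-controller directories are an assumed simplification of 'remoteRootLogDir' plus 'suffix'. It only models three things taken from the description above: splitting the comma-separated controller list, the cutoff computation from steps 1.1 to 1.3, and the difference between considering only the first controller and considering all of them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration; not AggregatedLogDeletionService. */
final class LogDeletionSketch {

  /** Splits a comma-separated controller list such as "TFile,IFile". */
  static List<String> parseControllers(String configValue) {
    List<String> names = new ArrayList<>();
    for (String part : configValue.split(",")) {
      String name = part.trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /** Steps 1.1-1.3: convert seconds to millis, then subtract from "now". */
  static long cutOffMillis(long nowMillis, long retentionSecs) {
    long retentionMillis = retentionSecs * 1000L;
    return nowMillis - retentionMillis;
  }

  /**
   * Returns the log directories a deletion pass would scan.
   * firstOnly = true models the reported behaviour (only the first
   * configured controller is considered); false models the requested one.
   * The controller-to-directory map is an assumption of this sketch.
   */
  static List<String> rootDirsToClean(List<String> controllers,
      Map<String, String> dirByController, boolean firstOnly) {
    List<String> dirs = new ArrayList<>();
    for (String controller : controllers) {
      String dir = dirByController.get(controller);
      if (dir != null) {
        dirs.add(dir);
      }
      if (firstOnly) {
        break; // stop after the first configured controller
      }
    }
    return dirs;
  }

  public static void main(String[] args) {
    Map<String, String> dirs = new LinkedHashMap<>();
    dirs.put("TFile", "/app-logs/tfile"); // hypothetical path
    dirs.put("IFile", "/app-logs/ifile"); // hypothetical path

    List<String> controllers = parseControllers("TFile,IFile");
    System.out.println("first only: " + rootDirsToClean(controllers, dirs, true));
    System.out.println("all:        " + rootDirsToClean(controllers, dirs, false));
    System.out.println("cutoff:     " + cutOffMillis(System.currentTimeMillis(), 3600L));
  }
}
```

With the value TFile,IFile, the first-only variant returns just the TFile directory, which matches the example in the description where files created with the IFile controller are kept.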
[jira] [Updated] (YARN-11188) Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
[ https://issues.apache.org/jira/browse/YARN-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated YARN-11188:
--
    Labels: pull-request-available  (was: )

> Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
> ---
>
>                 Key: YARN-11188
>                 URL: https://issues.apache.org/jira/browse/YARN-11188
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
[jira] [Updated] (YARN-11188) Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
[ https://issues.apache.org/jira/browse/YARN-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-11188:
--
    Summary: Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured  (was: Only files belong to the first first file controller are removed even if multiple log aggregation file controllers are configured)

> Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
> ---
>
>                 Key: YARN-11188
>                 URL: https://issues.apache.org/jira/browse/YARN-11188
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>             Fix For: 3.4.0