[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2019-03-12 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790508#comment-16790508
 ] 

Tarun Parimi commented on YARN-8617:


Hi [~bibinchundatt],

I was also facing this issue, and on testing in my local cluster I observed the 
following:

{quote}1. limit number of files per node
public static final String NM_LOG_AGGREGATION_NUM_LOG_FILES_SIZE_PER_APP
= NM_PREFIX + "log-aggregation.num-log-files-per-app";{quote}
This doesn't seem to work currently for IndexedFileFormat. After the file 
exceeds LOG_ROLL_OVER_MAX_FILE_SIZE_GB, a new file is created. But the older 
node files can keep on accumulating as long as the app is running. Should we 
implement this config for IndexedFileFormat also as a fix?
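
A minimal sketch of what such a per-node limit could look like, assuming each rolled-over node file can be ordered by its modification time. The names {{NodeFileLimiter}}, {{NodeFile}} and {{filesToDelete}} are hypothetical and are not part of Hadoop; {{NodeFile}} only stands in for an HDFS FileStatus.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Hadoop: selects the node files that exceed
// a per-node retention limit, oldest first.
final class NodeFileLimiter {

  // Stand-in for an HDFS FileStatus: a path plus its modification time.
  static final class NodeFile {
    final String path;
    final long modificationTime;

    NodeFile(String path, long modificationTime) {
      this.path = path;
      this.modificationTime = modificationTime;
    }
  }

  // Returns the paths to delete so that at most maxFiles of the newest
  // files remain for one node.
  static List<String> filesToDelete(List<NodeFile> nodeFiles, int maxFiles) {
    if (maxFiles < 0) {
      throw new IllegalArgumentException("maxFiles must be >= 0");
    }
    List<NodeFile> sorted = new ArrayList<>(nodeFiles);
    // Oldest files come first after sorting by modification time.
    sorted.sort(Comparator.comparingLong((NodeFile f) -> f.modificationTime));
    int excess = Math.max(0, sorted.size() - maxFiles);
    List<String> toDelete = new ArrayList<>();
    for (int i = 0; i < excess; i++) {
      toDelete.add(sorted.get(i).path);
    }
    return toDelete;
  }
}
```

With three files for one node and a limit of two, only the oldest file would be selected for deletion.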

{quote}For long running service the application folder eg 
:user/logs/application_1234 modification time gets updated on every upload 
cycle.
This could cause nodefile to remain in hdfs if no new containers are allocated 
to same node.{quote}
Should we check and delete node files in AggregatedLogDeletionService for 
RUNNING apps without checking the condition appDir.getModificationTime() < 
cutoffMillis? 
Doing so would delete the older node files and fix the problem of old node files 
accumulating.
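
The proposed check could be sketched roughly as follows. This is a simplified model and not the actual AggregatedLogDeletionService code: {{FileInfo}} stands in for an HDFS FileStatus, and {{RunningAppLogCleaner}}, {{expiredNodeFiles}} and the {{appRunning}} flag are hypothetical names introduced only for illustration. It only models which node files get selected; it does not model deletion of the whole application directory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the proposal: for RUNNING apps, examine node files
// even when the application directory itself was modified recently.
final class RunningAppLogCleaner {

  // Stand-in for an HDFS FileStatus: a path plus its modification time.
  static final class FileInfo {
    final String path;
    final long modificationTime;

    FileInfo(String path, long modificationTime) {
      this.path = path;
      this.modificationTime = modificationTime;
    }
  }

  // Returns the node files older than cutoffMillis. The appDir gate is kept
  // for non-running apps and skipped for running ones.
  static List<String> expiredNodeFiles(FileInfo appDir, List<FileInfo> nodeFiles,
      boolean appRunning, long cutoffMillis) {
    List<String> expired = new ArrayList<>();
    boolean appDirExpired = appDir.modificationTime < cutoffMillis;
    if (!appRunning && !appDirExpired) {
      return expired; // existing behaviour: a fresh appDir blocks the scan
    }
    for (FileInfo node : nodeFiles) {
      if (node.modificationTime < cutoffMillis) {
        expired.add(node.path);
      }
    }
    return expired;
  }
}
```

In this model a running app whose directory is touched on every upload cycle still has its stale node files selected, while the gated path returns nothing.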



> Aggregated Application Logs accumulates for long running jobs
> -
>
> Key: YARN-8617
> URL: https://issues.apache.org/jira/browse/YARN-8617
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: log-aggregation
>Affects Versions: 2.7.4
>Reporter: Prabhu Joseph
>Priority: Major
>
> Currently AggregationDeletionService will delete older aggregated log files 
> once when they are complete. This will cause logs to accumulate for Long 
> Running Jobs like Llap, Spark Streaming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-12-10 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714696#comment-16714696
 ] 

Bibin A Chundatt commented on YARN-8617:


[~Prabhu Joseph]

Looked into the issue again. YARN-2583 contains 2 parts:

# Limit the number of files per node:
  public static final String NM_LOG_AGGREGATION_NUM_LOG_FILES_SIZE_PER_APP
  = NM_PREFIX + "log-aggregation.num-log-files-per-app";
# Delete files older than the expiry time.

{code}
if (appDir.isDirectory() &&
    appDir.getModificationTime() < cutoffMillis) {
{code}

{quote}
The AggregatedLogDeletionService does deletion for Running Job based upon the 
file modification time which always will be latest as the rolled logs are 
getting updated into the node1 file regularly
{quote}

For a long-running service, the modification time of the *application folder* 
(e.g. user/logs/application_1234) gets updated on every upload cycle.
This could cause a node file to remain in HDFS if no new containers are 
allocated to the same node.






[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-06 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570773#comment-16570773
 ] 

Prabhu Joseph commented on YARN-8617:
-

Thanks [~bibinchundatt]. Still analyzing the issue in our cluster, will open 
this later if needed.




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-06 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570205#comment-16570205
 ] 

Bibin A Chundatt commented on YARN-8617:


[~Prabhu Joseph]

For IndexedFileFormat, the rolling is based on size:

{code}
  @Private
  @VisibleForTesting
  public long getRollOverLogMaxSize(Configuration conf) {
    return 1024L * 1024 * 1024 * conf.getInt(
        LOG_ROLL_OVER_MAX_FILE_SIZE_GB, 10);
  }
{code}
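
As a worked example of the arithmetic in that method, the sketch below computes the roll-over threshold in bytes for the default of 10 GB. The class and method names ({{RollOverSize}}, {{rollOverLogMaxSize}}, {{intProduct}}) are hypothetical and exist only for this illustration; the second method shows why the {{1024L}} literal matters.

```java
// Hypothetical illustration of the size arithmetic, not Hadoop code.
final class RollOverSize {

  // Same arithmetic as the snippet above: gigabytes to bytes, computed in
  // long because the first operand is a long literal.
  static long rollOverLogMaxSize(int maxFileSizeGb) {
    return 1024L * 1024 * 1024 * maxFileSizeGb;
  }

  // The same product computed in int arithmetic, which silently overflows
  // for values such as 10.
  static int intProduct(int maxFileSizeGb) {
    return 1024 * 1024 * 1024 * maxFileSizeGb;
  }

  public static void main(String[] args) {
    System.out.println(rollOverLogMaxSize(10)); // 10737418240
    System.out.println(intProduct(10));         // -2147483648
  }
}
```

So with the default value, a node file rolls over only after reaching 10737418240 bytes.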




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-06 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569983#comment-16569983
 ] 

Bibin A Chundatt commented on YARN-8617:


[~Prabhu Joseph]

IIUC, with LogAggregationService -> AppLogAggregatorImpl, every upload cycle 
produces a new file per node:

{code}
node3.openstack_45454_1533315091898
node3.openstack_45454_
node3.openstack_45454_
{code}

{{node3.openstack_45454_1533315091898}} will not be changed once uploaded, 
so AggregatedLogDeletionService should delete the file after the log retention time.


{code}
Set<Path> uploadedFilePathsInThisCycle =
    aggregator.doContainerLogAggregation(logAggregationFileController,
        appFinished, finishedContainers.contains(container));
...

logControllerContext.setLogUploadTimeStamp(System.currentTimeMillis());

try {
  this.logAggregationFileController.postWrite(logControllerContext);
  diagnosticMessage = "Log uploaded successfully for Application: "
      + appId + " in NodeManager: "
      + LogAggregationUtils.getNodeString(nodeId) + " at "
      + Times.format(logControllerContext.getLogUploadTimeStamp())
      + "\n";
{code}

{{this.logAggregationFileController.postWrite(logControllerContext)}} renames 
the file with a timestamp.
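
To make the naming concrete, here is a small sketch that builds and parses a name of the form host_port_timestamp. That pattern is inferred only from the file listing in this comment and is an assumption; {{NodeFileNames}}, {{timestampedName}} and {{uploadTimeOf}} are hypothetical helpers, not Hadoop APIs.

```java
// Hypothetical helpers illustrating the host_port_timestamp naming seen in
// the listing; not part of Hadoop.
final class NodeFileNames {

  // Builds a per-cycle node file name from the node address and the upload time.
  static String timestampedName(String host, int port, long uploadTimeMillis) {
    return host + "_" + port + "_" + uploadTimeMillis;
  }

  // Extracts the trailing timestamp; throws NumberFormatException when the
  // part after the last underscore is not a number.
  static long uploadTimeOf(String fileName) {
    int idx = fileName.lastIndexOf('_');
    return Long.parseLong(fileName.substring(idx + 1));
  }
}
```

Because each cycle gets its own name, a file written in an earlier cycle is never appended to again and can age out on its own.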







[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568742#comment-16568742
 ] 

Prabhu Joseph commented on YARN-8617:
-

[~bibinchundatt]  
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is already 
set to 3600. Will explain the issue clearly.

A long-running container on node1 writes daemon.log and rotates it 
(daemon.log.1, daemon.log.2, etc.). YARN log aggregation happens for the rolled 
log files as expected: it combines all files for that node and places them under 
the HDFS path in a single log file, referred to here as the node1 file. 

{code}
[hdfs@node2 tmp]$ hadoop fs -ls /app-logs/hive/logs-ifile/application_1533261669437_0001/node3.openstack_45454_1533315091898
-rw-r-   3 hive hadoop  29207 2018-08-03 19:51 /app-logs/hive/logs-ifile/application_1533261669437_0001/node3.openstack_45454_1533315091898
{code}

The AggregatedLogDeletionService deletes files for a running job based on the 
file modification time, which will always be recent because the rolled logs are 
regularly appended to the node1 file. It therefore does not delete the older 
log contents (daemon.log.2 etc.) that are part of the node1 file. This causes the 
node1 file to keep growing for as long as a container is running on that node.

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L129

{code}
for (FileStatus node : logFiles) {
  if (node.getModificationTime() < cutoffMillis) {
    try {
      fs.delete(node.getPath(), true);
    } catch (IOException ex) {
      logException("Could not delete " + appDir.getPath(), ex);
    }
  }
}
{code}





[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568316#comment-16568316
 ] 

Prabhu Joseph commented on YARN-8617:
-

Yes [~bibinchundatt], YARN-2583 matches what I am expecting. Will test with a 
non-negative value for 
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds. Thanks a lot.




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568306#comment-16568306
 ] 

Bibin A Chundatt commented on YARN-8617:


Jira id YARN-2583




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568285#comment-16568285
 ] 

Prabhu Joseph commented on YARN-8617:
-

Yes, the log files in the NM local directory will be removed. But I don't see 
the aggregated log files under the HDFS path 
yarn.nodemanager.remote-app-log-dir getting removed by the JHS after 
yarn.log-aggregation.retain-seconds while the job is *RUNNING*. It 
deletes them only for completed apps.




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568261#comment-16568261
 ] 

Bibin A Chundatt commented on YARN-8617:


In {{AppLogAggregatorImpl#uploadLogsForContainers}}, all successfully uploaded 
logs will be deleted too.

{code}
Set<Path> uploadedFilePathsInThisCycle =
    aggregator.doContainerLogAggregation(logAggregationFileController,
        appFinished, finishedContainers.contains(container));
if (uploadedFilePathsInThisCycle.size() > 0) {
  uploadedLogsInThisCycle = true;
  List<Path> uploadedFilePathsInThisCycleList = new ArrayList<>();
  uploadedFilePathsInThisCycleList.addAll(uploadedFilePathsInThisCycle);
  deletionTask = new FileDeletionTask(delService,
      this.userUgi.getShortUserName(), null,
      uploadedFilePathsInThisCycleList);
}
{code}




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568234#comment-16568234
 ] 

Prabhu Joseph commented on YARN-8617:
-

[~bibinchundatt] Will this also remove the aggregated log files from 
yarn.nodemanager.remote-app-log-dir while the job is running?




[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-03 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567967#comment-16567967
 ] 

Bibin A Chundatt commented on YARN-8617:


[~Prabhu Joseph] 

For long-running jobs, we can enable rolling aggregation at the NodeManagers.
Configure a non-negative value for 
{{yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds}}. The default 
value is -1.
