[jira] [Commented] (YARN-2468) Log handling for LRS

2016-07-25 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391708#comment-15391708
 ] 

Lars Francke commented on YARN-2468:


I know this is an old issue but I can't find any documentation on the 
behaviour. It'd be fabulous if someone could summarise what the status is and 
what's changed here and which parts of it are configurable.

The very first post also mentions a LOG_DIR environment variable but I can't 
find any mention of it either.

So what do applications need to do with their Log files to be handled properly?

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-21 Thread Kannan Rajah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179001#comment-14179001
 ] 

Kannan Rajah commented on YARN-2468:


Makes sense. Sorry, but I have just one last question, not completely relevant 
to this JIRA though. Is there any ongoing effort to write the logs directly to 
HDFS instead of this 2 phase approach? If not, can you point out the reasons? 
This work being done to take care of the lifecycle of these logs seem fairly 
complex and also potentially adding performance overhead to the cluster. So I 
am interested to understand the rationale.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-21 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178869#comment-14178869
 ] 

Xuan Gong commented on YARN-2468:
-

bq. Thanks. But what about the case where the file was modified. Let's say 10 
more lines were added to the syslog file. Doesn't it upload the full file again?

This is the pre-requirement: We will rely on user’s log application (such as 
log4j) to do the rollover for the logs. Users need to set up their log 
application properly. For our side, we upload every logs in our log dirs.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-21 Thread Kannan Rajah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178864#comment-14178864
 ] 

Kannan Rajah commented on YARN-2468:


Thanks. But what about the case where the file was modified. Let's say 10 more 
lines were added to the syslog file. Doesn't it upload the full file again?

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-21 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178809#comment-14178809
 ] 

Xuan Gong commented on YARN-2468:
-

[~rkannan82] 
bq. Xuan, I looked through the code changes and have a question about uploading 
logs for unfinished containers. Let's say we have already uploaded syslog for a 
container at time T1. At time T2, the container is still running and when the 
log aggregation is triggered again, will it re-upload the same syslog file? 
That seems to be the case.

It will not. EveryTime after we do the log aggregation, we will save the 
information for aggregated log file with (containerId.toString() + "_" + 
file.getName() + "_"+ file.lastModified()). So, in next run, before we start to 
upload logs, we will check the log file whether it exists in the 
savedAggregatedLogFileCache (uploadedFileMeta in AppLogAggregatorImpl), if it 
exists, we will skip. Otherwise, we will upload it.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-21 Thread Kannan Rajah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178794#comment-14178794
 ] 

Kannan Rajah commented on YARN-2468:


Xuan, I looked through the code changes and have a question about uploading 
logs for unfinished containers. Let's say we have already uploaded syslog for a 
container at time T1. At time T2, the container is still running and when the 
log aggregation is triggered again, will it re-upload the same syslog file? 
That seems to be the case.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159162#comment-14159162
 ] 

Hudson commented on YARN-2468:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1916 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1916/])
YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for 
use by long running services. Contributed by Xuan Gong. (vinodkv: rev 
34cdcaad71cad76c0874a4e5266b4074009d2ffc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogAggregationUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogsBlock.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java


> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159138#comment-14159138
 ] 

Hudson commented on YARN-2468:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1891 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1891/])
YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for 
use by long running services. Contributed by Xuan Gong. (vinodkv: rev 
34cdcaad71cad76c0874a4e5266b4074009d2ffc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogAggregationUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java


> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159031#comment-14159031
 ] 

Hudson commented on YARN-2468:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #700 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/700/])
YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for 
use by long running services. Contributed by Xuan Gong. (vinodkv: rev 
34cdcaad71cad76c0874a4e5266b4074009d2ffc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogsBlock.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogAggregationUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java


> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158398#comment-14158398
 ] 

Hudson commented on YARN-2468:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6187 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6187/])
YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for 
use by long running services. Contributed by Xuan Gong. (vinodkv: rev 
34cdcaad71cad76c0874a4e5266b4074009d2ffc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogAggregationUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java


> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-03 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158385#comment-14158385
 ] 

Vinod Kumar Vavilapalli commented on YARN-2468:
---

Looks good, +1. Checking this in.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157619#comment-14157619
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672717/YARN-2468.11.patch
  against trunk revision 054f285.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5248//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5248//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157588#comment-14157588
 ] 

Xuan Gong commented on YARN-2468:
-

bq. Also ContainerLogAggregator.uploadedFileMeta is also not needed to be a 
class member.

I think ContainerLogAggregator.uploadedFileMeta is needed to be a class member. 
It is used to keep track of all previous uploaded log files for each container. 
We will use this information to decide whether this log can be aggregated. 

New patch addressed other comments

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.11.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157291#comment-14157291
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672626/YARN-2468.10.patch
  against trunk revision f679ca3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5244//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5244//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.2.patch, YARN-2468.3.patch, YARN-2468.3.rebase.2.patch, 
> YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, YARN-2468.4.patch, 
> YARN-2468.5.1.patch, YARN-2468.5.1.patch, YARN-2468.5.2.patch, 
> YARN-2468.5.3.patch, YARN-2468.5.4.patch, YARN-2468.5.patch, 
> YARN-2468.6.1.patch, YARN-2468.6.patch, YARN-2468.7.1.patch, 
> YARN-2468.7.patch, YARN-2468.8.patch, YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157199#comment-14157199
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672626/YARN-2468.10.patch
  against trunk revision a56f3ec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5241//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5241//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.2.patch, YARN-2468.3.patch, YARN-2468.3.rebase.2.patch, 
> YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, YARN-2468.4.patch, 
> YARN-2468.5.1.patch, YARN-2468.5.1.patch, YARN-2468.5.2.patch, 
> YARN-2468.5.3.patch, YARN-2468.5.4.patch, YARN-2468.5.patch, 
> YARN-2468.6.1.patch, YARN-2468.6.patch, YARN-2468.7.1.patch, 
> YARN-2468.7.patch, YARN-2468.8.patch, YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157133#comment-14157133
 ] 

Xuan Gong commented on YARN-2468:
-

new patch addressed all the comments

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.10.patch, 
> YARN-2468.2.patch, YARN-2468.3.patch, YARN-2468.3.rebase.2.patch, 
> YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, YARN-2468.4.patch, 
> YARN-2468.5.1.patch, YARN-2468.5.1.patch, YARN-2468.5.2.patch, 
> YARN-2468.5.3.patch, YARN-2468.5.4.patch, YARN-2468.5.patch, 
> YARN-2468.6.1.patch, YARN-2468.6.patch, YARN-2468.7.1.patch, 
> YARN-2468.7.patch, YARN-2468.8.patch, YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-10-01 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156127#comment-14156127
 ] 

Zhijie Shen commented on YARN-2468:
---

bq. I would like to check how many log files we can upload this time. If the 
number is 0, we can skip this time. And this check is also happened before 
LogKey.write(), otherwise, we will write key, but without value.

I think Vinod meant that pendingUploadFiles is needed, but doesn't need to the 
member variable. getPendingLogFilesToUploadForThisContainer can return this 
collection, and pass it into LogValue.write by adding one param of it.

2. IMHO, the following code can be improved. If we use iterator, we can delete 
the unnecessary element on the fly.
{code}
  for (File file : candidates) {
Matcher fileMatcher = filterPattern.matcher(file.getName());
if (fileMatcher.find()) {
  filteredFiles.add(file);
}
  }
  if (!exclusion) {
return filteredFiles;
  } else {
candidates.removeAll(filteredFiles);
return candidates;
  }
{code}
This block could be:
{code}
...
while(candidatesItr.hasNext()) {
  candidate = candidatesItr.next();
  ...
  if ((not match && inclusive) || (match && exclusive)) {
candidatesItr.remove()
  } 
}
{code}

3. [~jianhe] mentioned to me before that the following condition is not always 
true to determine an AM container. Any idea? And it seems that we don't need 
shouldUploadLogsForRunningContainer, we can re-use shouldUploadLogs and set 
wasContainerSuccessful to true. Personally, if it's not trivial to identify the 
AM container, I prefer to write a TODO comment and leave it until we implement 
the log retention API.
{code}
  if (containerId.getId() == 1) {
return true;
  }
{code}

bq. It seems to be, let's validate this via a test-case.

Is it addressed by
{code}
this.conf.setLong(YarnConfiguration.DEBUG_NM_DELETE_DELAY_SEC, 3600);
{code}
Is it better to add a line of comment of the rationale behind the config?

5. Can the following code
{code}
Set finishedContainers = new HashSet();
for (ContainerId id : pendingContainerInThisCycle) {
  finishedContainers.add(id);
}
{code}
be simplified as
{code}
 Set finishedContainers = new 
HashSet(pendingContainerInThisCycle);
{code}

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152835#comment-14152835
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12671951/YARN-2468.9.1.patch
  against trunk revision 0577eb3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5181//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5181//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, 
> YARN-2468.9.1.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152717#comment-14152717
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12671923/YARN-2468.9.patch
  against trunk revision 0577eb3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5178//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5178//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-29 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152600#comment-14152600
 ] 

Xuan Gong commented on YARN-2468:
-

New patch addressed all other comments

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-29 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152599#comment-14152599
 ] 

Xuan Gong commented on YARN-2468:
-

bq. Why is the test in TestAggregatedLogsBlock ignored?

We will have YARN-2583 for web UI related changes. This test will be failed 
right now. So, I add @ignored

bq. pendingUploadFiles is really not neded to be a class field. Rename 
getNumOfLogFilesToUpload() to be getPendingLogFilesToUploadForThisContainer() 
and return the set of pending files. LogValue.write() can then take Set 
pendingLogFilesToUpload as one of the arguments.

I would like to check how many log files we can upload this time. If the number 
is 0, we can skip this time. And this check is also happened before 
LogKey.write(), otherwise, we will write key, but without value.

bq. If deletion of previously uploaded file takes a while and the file remains 
by the time of the next cycle, we will upload it again? It seems to be, let's 
validate this via a test-case.

No, it will not. That is why I saved many information, such as 
allExistingFiles, alreadyUploadedFiles and etc. We will those to check whether 
the logs have been uploaded before.

bq. testLogAggregationServiceWithInterval: doLogAggregationOutOfBand + 
Thread.sleep() is unreliable. Use a clock and refactor AppLogAggregatorImpl to 
have the cyclic aggregation directly callable via a method.

The Thread.sleep() is not used to trigger the logAggregation. It is used to 
make sure the logs has been uploaded into the remote directory. But, deleted 
those Thread.sleep() from the testcases.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch, YARN-2468.9.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150292#comment-14150292
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12671569/YARN-2468.8.patch
  against trunk revision 6b7673e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5161//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5161//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-26 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150266#comment-14150266
 ] 

Zhijie Shen commented on YARN-2468:
---

The test failures seems to be related to addressing binding conflicts. Restart 
a build.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150244#comment-14150244
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12671569/YARN-2468.8.patch
  against trunk revision 6b7673e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
  
org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater
  
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

org.apache.hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart
org.apache.hadoop.yarn.server.TestContainerManageTests

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5159//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5159//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-26 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150221#comment-14150221
 ] 

Zhijie Shen commented on YARN-2468:
---

+1 for the latest patch

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch, YARN-2468.8.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150093#comment-14150093
 ] 

Xuan Gong commented on YARN-2468:
-

bq.  One question: if one file is failed at uploading in LogValue.write(), 
uploadedFiles will not reflect the missing uploaded file, and it will not be 
uploaded again?

Good catch. Fixed this in the new patch.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-25 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148834#comment-14148834
 ] 

Zhijie Shen commented on YARN-2468:
---

The patch is generally good. Some minor comments, and puzzles about the code.

1. The first one is \@VisibleForTesting? And the second one is not necessary?
{code}
-  private static String getNodeString(NodeId nodeId) {
+  public static String getNodeString(NodeId nodeId) {
 return nodeId.toString().replace(":", "_");
   }
-  
+
+  public static String getNodeString(String nodeId) {
+return nodeId.replace(":", "_");
+  }
{code}

2. Add a TODO to say the test will be fixed in a in followup Jira, in case we 
forget it?
{code}
+  @Ignore
   @Test
   public void testNoLogs() throws Exception {
{code}

3. Based on my understanding, uploadedFiles is the candidate files to upload? 
If so, can we rename the variables and related methods?
{code}
+private Set uploadedFiles = new HashSet();
{code}

4. I assume this var is going to capture all the existing log files on HDFS, 
isn't it? If so, the computation of it seems to be problematic, because it 
doesn't exclude the files to be excluded. And what's the effect on 
alreadyUploadedLogs?
{code}
+private Set allExistingFileMeta = new HashSet();
{code}
{code}
  Iterable mask =
  Iterables.filter(alreadyUploadedLogs, new Predicate() {
@Override
public boolean apply(String next) {
  return currentExistingLogFiles.contains(next);
}
  });
{code}

5. Make the old LogValue constructor based on the new one?

6. LogValue.write is not necessary to be changed?

7. It's recommended to close the Closable objects via IOUtils, but it seems 
that AggregatedLogFormat already has this issue before. Let's file a separate 
ticket for it.
{code}
+if (this.fsDataOStream != null) {
+  this.fsDataOStream.close();
+}
{code}

8. nodeId seems to be of no use. No need to be passed into AppLogAggregatorImpl.
{code}
+  private final NodeId nodeId;
{code}

9. remoteNodeLogDirForApp doesn't affect remoteNodeTmpLogFileForApp, which only 
depends on remoteNodeLogFileForApp. remoteNodeLogFileForApp is determined at 
construction, so remoteNodeTmpLogFileForApp should be final and computed once 
in constructor as well. And constructor param remoteNodeLogDirForApp should be 
renamed back to remoteNodeLogFileForApp.
{code}
-  private final Path remoteNodeTmpLogFileForApp;
+  private Path remoteNodeTmpLogFileForApp;
{code}
{code}
-  private Path getRemoteNodeTmpLogFileForApp() {
+  private Path getRemoteNodeTmpLogFileForApp(Path remoteNodeLogDirForApp) {
 return new Path(remoteNodeLogFileForApp.getParent(),
-(remoteNodeLogFileForApp.getName() + TMP_FILE_SUFFIX));
+  (remoteNodeLogFileForApp.getName() + 
LogAggregationUtils.TMP_FILE_SUFFIX));
   }
{code}

10. One typo
{code}
  // if any of the previous uoloaded logs have been deleted,
{code}

11. One question: if one file is failed at uploading in LogValue.write(), 
uploadedFiles will not reflect the missing uploaded file, and it will not be 
uploaded again?

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147238#comment-14147238
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12671109/YARN-2468.7.1.patch
  against trunk revision 72b0881.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5114//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5114//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-22 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143499#comment-14143499
 ] 

Xuan Gong commented on YARN-2468:
-

This is a very big patch and is hard for review. I would like to decouple the 
big patch to several smaller patches:
1) api changes will be tracked by YARN-2569
2) NMs need to find a way to get LogAggregationContext. This will be tracked by 
YARN-2581
3) Current ticket will be used to track changes for NM handling the logs for 
LRS which will include the log layout changes
4) Log Deletion Service changes will be tracked by YARN-2583
5) Related CL and web UI changes will be tracked by YARN-2582

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139619#comment-14139619
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669809/YARN-2468.6.1.patch
  against trunk revision a337f0e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  org.apache.hadoop.yarn.client.cli.TestLogsCLI
  
org.apache.hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5030//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5030//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139574#comment-14139574
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669802/YARN-2468.6.patch
  against trunk revision 474f116.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5029//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5029//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139515#comment-14139515
 ] 

Xuan Gong commented on YARN-2468:
-

bq. If we want to use "day" as the unit, shall we allow "double" interval, such 
that I can say 0.1 day. I think it would be useful for testing purpose (not 
unit test, but verifying the feature is working on purpose, and the app is 
using the feature correctly). Thoughts?

Add extra YarnConfiguration.DEBUG_LOG_AGGREGATION_SPEED_UP_RATIO for test/debug 
purpose.

New patch also address other comments

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138647#comment-14138647
 ] 

Zhijie Shen commented on YARN-2468:
---

1. Shall we say "if the log file name matches both the include and the exclude 
patterns, the file will be excluded eventually"?
{code}
+ * filter the log files. The log files which match the include
+ * patterns will not be uploaded. If any patterns are defined in
+ * both logIncludePatterns and logExcludePatterns, those files matched
+ * with these patterns will not be uploaded.
{code}

2. Except this method, other new APIs are marked \@Stable. Is it a bit 
aggressive as it is not a simple new feature?
{code}
+public abstract class LogContext {
+
+  @Public
+  @Unstable
+  public static LogContext newInstance(Set logIncludePatterns,
{code}

3. Is ACCEPTED or SUBMITTED to early to say the log is available? The AM 
container my not have been launched yet.
{code}
   switch (appReport.getYarnApplicationState()) {
   case NEW:
   case NEW_SAVING:
+return -1;
   case ACCEPTED:
   case SUBMITTED:
   case RUNNING:
-return -1;
{code}

4. numOflogFiles -> getNumOfLogFilesToUpload?

5. The following code always executes no matter the include patterns are 
applied or not. getFilteredLogFiles log seem to be improvable.
{code}
  Set candiates =
  new HashSet(Arrays.asList(containerLogDir.listFiles()));
{code}

6. Why not just tmp suffix?
{code}
  public static final String TMP_FILE = "TEMP.tmp";
{code}

7. Should you call getRemoteNodeLogDirForApp here, though both methods seem to 
do the same thing?
{code}
+Path remoteAppNodeLogDir = LogAggregationUtils.getRemoteNodeLogFileForApp(
{code}

8. In this case should we return -1?
{code}
if (!foundContainerLogs) {
  System.out.println("Logs for container " + containerId
  + " are not present in this log-file.");
}
{code}

9. Do you want to remove getRemoteNodeLogFileForApp (in LogAggregationUtils and 
LogAggregationService), which is not used by production code any more, and 
getRemoteNodeLogDirForApp is doing the identical stuff.

10. Not right variable name convention.
{code}
Path node_appDir = LogAggregationUtils.getRemoteNodeLogDirForApp(
{code}
{code}
+long time_stamp = 0;
{code}
{code}
+while (count <= max_attempt) {
{code}

11. Change appLogAggregators to public directly?
{code}
  @Private
  @VisibleForTesting
  public ConcurrentMap getAppLogAggregators() {
return this.appLogAggregators;
  }
{code}

12. In testLogAggregationServiceWithInterval, round 1, 2, 3's logic seems to be 
similar. It is possible to have iteration of i = 1 -> 3?

bq. 3. Is the "day" unit too coarse?

If we want to use "day" as the unit, shall we allow "double" interval, such 
that I can say 0.1 day. I think it would be useful for testing purpose (not 
unit test, but verifying the feature is working on purpose, and the app is 
using the feature correctly). Thoughts?

bq. I think that we can solve this problem in separate ticket, because this 
ticket is the first step to solve Log handling for LRS.

Sounds good. It shouldn't be a common case, unless we config to delay the log 
deletion for debugging purpose. Please file a Jira about it.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138470#comment-14138470
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669626/YARN-2468.5.4.patch
  against trunk revision 10e8602.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5022//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5022//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138439#comment-14138439
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669611/YARN-2468.5.3.patch
  against trunk revision 47e5e19.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5020//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5020//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138436#comment-14138436
 ] 

Hadoop QA commented on YARN-2468:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669611/YARN-2468.5.3.patch
  against trunk revision 47e5e19.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5019//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5019//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138411#comment-14138411
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669602/YARN-2468.5.2.patch
  against trunk revision 47e5e19.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5018//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138405#comment-14138405
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669602/YARN-2468.5.2.patch
  against trunk revision 47e5e19.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5017//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138127#comment-14138127
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669521/YARN-2468.5.1.patch
  against trunk revision f230248.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5007//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5007//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137740#comment-14137740
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669460/YARN-2468.5.patch
  against trunk revision d9a8603.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/4995//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4995//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4995//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-17 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137661#comment-14137661
 ] 

Xuan Gong commented on YARN-2468:
-

bq. If LogContext is not specified, we're running into the traditional log 
handling case, right? We will still have a combined log file identified by the 
node id? Or node id will always be the directory, and there exists only one 
file under it?

node id will always be the directory, and there exists only one file under it

bq. Let's say if work-preserving NM restarting happens, NM is going to forget 
all the uploaded logs files, and redo everything, right?

If NM restarts happens, it will upload all logs which are previous uploaded, 
but not deleted. 
I think that we can solve this problem in separate ticket, because this ticket 
is the first step to solve Log handling for LRS. 

bq. LogContext doesn't need to be in ApplicatonSubmissionContext, because 
ApplicatonSubmissionContext contains ContainerLaunchContext. LogContext is 
container related stuff, such that ContainerLaunchContext should be the best 
place. Concurrently, we can have one context for all containers. Maybe in the 
future we can think of setting different LogContext for each individual 
container.

DONE

bq. In getFilteredLogFiles, the logic is that if the log file matches the 
include pattern, it will be added first, and if then if it matches the exclude 
pattern, it will be removed. Shall we do the sanity check to make sure we can 
not include and exclude the same pattern, otherwise, the semantics is a bit 
weird.

Add more explanation in javaDoc.

Uploaded a new patch to address all comments.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136869#comment-14136869
 ] 

Zhijie Shen commented on YARN-2468:
---

[~xgong], thanks for the patch. I agree with the solution in general. Here're 
some comments so far about the patch. I may post more after a second scan.

1. LogContext doesn't need to be in ApplicatonSubmissionContext, because 
ApplicatonSubmissionContext contains ContainerLaunchContext. LogContext is 
container related stuff, such that ContainerLaunchContext should be the best 
place. Concurrently, we can have one context for all containers. Maybe in the 
future we can think of setting different LogContext for each individual 
container.

2. Shouldn't be "exclude" patterns?
{code}
+ * logExcludePattern. It defines include patterns which is used to
+ * filter the log files. The log files which match the include
+ * patterns will not be uploaded.
{code}

3. Is the "day" unit too coarse?
{code}
+ * how often the logAggregationSerivce uploads container logs in days
{code}

4. The message is a bit confusing. The application is not running or completed? 
Then what's the app state?
{code}
+System.out.println("Application is not running or completed."
++ " Logs are only available when an application is running"
++ " or completes")
{code}

5. Why not logContext = p.getLogContext() directly?
{code}
+LogContext logContext = null;
+if (p.getLogContext() != null) {
+  logContext = new LogContextPBImpl(p.getLogContext());
+}
{code}

6. Should the method be public?
{code}
public int numOflogFiles() {
{code}

7. In getFilteredLogFiles, the logic is that if the log file matches the 
include pattern, it will be added first, and if then if it matches the exclude 
pattern, it will be removed. Shall we do the sanity check to make sure we can 
not include and exclude the same pattern, otherwise, the semantics is a bit 
weird.

8. The following code doesn't need to be executed by the doas ugi?
{code}
  this.remoteAppLogFile = remoteAppLogFile;
  fc = FileContext.getFileContext(conf);
  fc.setUMask(APP_LOG_FILE_UMASK);
{code}

9. Changes in dumpAllContainersLogs just are just reformatting? If so, shall we 
undo it?

10. Is it okay that we have removed the following code to close the writer?
{code}
if (this.writer != null) {
  this.writer.close();
  LOG.info("Finished aggregate log-file for app " + this.applicationId);
}
{code}

And I've two general questions:

1. If LogContext is not specified, we're running into the traditional log 
handling case, right? We will still have a combined log file identified by the 
node id? Or node id will always be the directory, and there exists only one 
file under it?

2. Let's say if work-preserving NM restarting happens, NM is going to forget 
all the uploaded logs files, and redo everything, right?

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136483#comment-14136483
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669227/YARN-2468.4.1.patch
  against trunk revision 3e85f5b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.logaggregation.TestAggregatedLogsBlock
  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4979//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4979//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135226#comment-14135226
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12669005/YARN-2468.4.patch
  against trunk revision 7e08c0f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4968//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4968//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135095#comment-14135095
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12668986/YARN-2468.3.rebase.2.patch
  against trunk revision 7e08c0f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4967//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4967//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134257#comment-14134257
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12668794/YARN-2468.3.rebase.patch
  against trunk revision 24d920b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4959//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4959//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132193#comment-14132193
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668465/YARN-2468.3.patch
  against trunk revision 54e5794.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1265 javac 
compiler warnings (more than the trunk's current 1011 warnings).

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
109 warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/4942//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4942//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4942//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4942//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132012#comment-14132012
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668421/YARN-2468.2.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 4 
warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/4934//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4934//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4934//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4934//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-12 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131892#comment-14131892
 ] 

Xuan Gong commented on YARN-2468:
-

Did more investigations and offline discussions. It turns out this is a really 
hard problem. So, we decide to solve this step by step. 

For the first step, we will stick to the original proposal: change the log 
layout, create a directory (named as node id of the NM), under this directory, 
every time when AppLogAggregatorImpl starts to upload container logs; it will 
create a file (named as node_id + timestamp). 
This method will increase the number of log files, but it will work fine for a 
small cluster. 

For the next step, we need to find a better way to handle the logs more 
efficiently. We would like to aggregate all containers’ log (Those containers 
are belong to the same NM) in a single file. In that case, the total number of 
logs is bounded. But we need find more scalable way, other than TFile, to do 
it. Will open a separate ticket for this.


> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-04 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121778#comment-14121778
 ] 

Zhijie Shen commented on YARN-2468:
---

I'm afraid it may be not fair to compare the log file creation between a single 
long running service and a short-term application.

I'm thinking about the file problem in a different direction.  Let's see how 
many log files will be created for a YARN cluster. For example, a long running 
service takes 10% resource from the cluster, and runs for 10 days. On each day, 
it will spawn out 1 log file per day. On the other side, for example, a normal 
application also takes 10% resource from the cluster, runs for 1 days, and 
spawn out 1 log file. Suppose the application will be started every day. Over 
10 days, the number of spawned logs of both the long running service and the 10 
iterations of the application is 10.

So from the point of view of the cluster, the number of logs is proportional to 
the resource usage instead of the application number. The similar resource 
usage may result in the similar number of log files. The case may not becoming 
even worse if we take the whole cluster into account. However, I agree we loose 
the opportunity to even make a long running service to use a single log file, 
reducing the total log file number.

To completely resolve the too-many-files problem, we make think of timeline 
server, which has the store layer to deal with the real I/O on your behalf. 
Another optimization may be log retention, I'm not sure the feature already 
exists or have been proposed together in this solution.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-02 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118519#comment-14118519
 ] 

Vinod Kumar Vavilapalli commented on YARN-2468:
---

The overall proposal is fine.

bq. 3. Change the log layout. Currently, all container logs (for the same NM) 
will be written into a single file. This does not work for LRS. We will create 
a directory (named as node id of the NM), under this directory, every time when 
AppLogAggregatorImpl starts to upload container logs; it will create a file 
(named as node_id + timestamp). 
We already have the too-many-files problem, this solution will make it worse. 
I'll give that this is a hard problem to solve - need to think about it a 
little more.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114777#comment-14114777
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665186/YARN-2468.1.patch
  against trunk revision d8774cc.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to cause Findbugs 
(version 2.0.3) to fail.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4761//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4761//console

This message is automatically generated.

> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-08-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114754#comment-14114754
 ] 

Xuan Gong commented on YARN-2468:
-

We will rely on user’s log application (such as log4j) to do the rollover for 
the logs. We have already exposed an environment variable LOG_DIR that the 
users can use to set up their log application.  We will provide a log service 
that will upload all the qualified logs periodically. 

Several changes in this patch:
1. Create a logContext (add into ApplicationSubmissionContext), which includes 
include_patterns, exclude_patterns (Those are used to filter the logs) and 
intervals (defines how often log aggregation service uploads the container 
logs).
2. AppLogAggregatorImpl will upload container logs periodically instead of 
waiting for the application to finish.
3. Change the log layout. Currently, all container logs (for the same NM) will 
be written into a single file. This does not work for LRS. We will create a 
directory (named as node id of the NM), under this directory, every time when 
AppLogAggregatorImpl starts to upload container logs; it will create a file 
(named as node_id + timestamp).  


> Log handling for LRS
> 
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.2#6252)