[jira] [Commented] (MAPREDUCE-6468) Consistent log severity level guards and statements

2015-09-08 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734688#comment-14734688
 ] 

Tsuyoshi Ozawa commented on MAPREDUCE-6468:
---

+1, checking this in.

> Consistent log severity level guards and statements 
> 
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jackie Chang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9995-00.patch, HADOOP-9995.patch, 
> MAPREDUCE-6468-01.patch
>
>
> Developers use logs to do in-house debugging. These log statements are later 
> demoted to less severe levels and usually are guarded by their matching 
> severity levels. However, we do see inconsistencies in trunk. A log statement 
> like 
> {code}
>if (LOG.isDebugEnabled()) {
> LOG.info("Assigned container (" + allocated + ") "
> {code}
> doesn't make much sense, because the message is written at INFO level yet is 
> only emitted when DEBUG is enabled. Previous issues have tried to correct this 
> inconsistency. I am proposing a comprehensive correction across trunk.
> Doug Cutting pointed it out in HADOOP-312: 
> https://issues.apache.org/jira/browse/HADOOP-312?focusedCommentId=12429498&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12429498
> HDFS-1611 also corrected this inconsistency.
> This could have been avoided by switching from log4j to slf4j's {} format, as 
> CASSANDRA-625 (2010/3) and ZOOKEEPER-850 (2012/1) did, which gives cleaner 
> code and slightly higher performance.
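For illustration, a minimal self-contained sketch of the two consistent 
alternatives the description mentions -- matching the statement to its guard, or 
using slf4j's {} format. The class and method names below are made up:
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AllocationLogExample {
  private static final Logger LOG = LoggerFactory.getLogger(AllocationLogExample.class);

  void logAssignment(Object allocated) {
    // Option 1: keep the explicit guard, but log at the level the guard checks.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Assigned container (" + allocated + ")");
    }
    // Option 2: slf4j's {} format; the message is only formatted when DEBUG is enabled.
    LOG.debug("Assigned container ({})", allocated);
  }
}
{code}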



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6468) Consistent log severity level guards and statements in RMContainerAllocator

2015-09-08 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734697#comment-14734697
 ] 

Tsuyoshi Ozawa commented on MAPREDUCE-6468:
---

[~jagadesh.kiran] Before checking this in, I've checked the initial patch by 
Jackie. We should fix the same inconsistency of log levels in other files as well:

1. RMContainerRequestor
{code}
if (LOG.isDebugEnabled()) {
  LOG.info("AFTER decResourceRequest:" + " applicationId="
  + applicationId.getId() + " priority=" + priority.getPriority()
  + " resourceName=" + resourceName + " numContainers="
  + remoteRequest.getNumContainers() + " #asks=" + ask.size());
}
{code}
2. LeafQueue
{code}
if (LOG.isDebugEnabled()) {
  LOG.info(getQueueName() + 
  " user=" + userName + 
  " used=" + queueUsage.getUsed() + " numContainers=" + numContainers +
  " headroom = " + application.getHeadroom() +
  " user-resources=" + user.getUsed()
  );
}
{code}
3. ContainerTokenSelector
{code}
for (Token token : tokens) {
  if (LOG.isDebugEnabled()) {
LOG.info("Looking for service: " + service + ". Current token is "
+ token);
  }
  if (ContainerTokenIdentifier.KIND.equals(token.getKind()) && 
  service.equals(token.getService())) {
return (Token) token;
  }
}
{code}
and so on. Could you check his patch 
(https://issues.apache.org/jira/secure/attachment/12605179/HADOOP-9995.patch) 
and update those places as well?
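For reference, the expected change in each case is presumably just to log at the 
level the guard checks; for example, for the ContainerTokenSelector snippet above 
(a sketch, not the committed patch):
{code}
// Sketch: the guard and the statement now use the same level (DEBUG).
for (Token token : tokens) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Looking for service: " + service + ". Current token is "
        + token);
  }
  if (ContainerTokenIdentifier.KIND.equals(token.getKind()) &&
      service.equals(token.getService())) {
    return (Token) token;
  }
}
{code}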

> Consistent log severity level guards and statements in RMContainerAllocator
> ---
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jackie Chang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9995-00.patch, HADOOP-9995.patch, 
> MAPREDUCE-6468-01.patch
>
>
> Developers use logs to do in-house debugging. These log statements are later 
> demoted to less severe levels and usually are guarded by their matching 
> severity levels. However, we do see inconsistencies in trunk. A log statement 
> like 
> {code}
>if (LOG.isDebugEnabled()) {
> LOG.info("Assigned container (" + allocated + ") "
> {code}
> doesn't make much sense, because the message is written at INFO level yet is 
> only emitted when DEBUG is enabled. Previous issues have tried to correct this 
> inconsistency. I am proposing a comprehensive correction across trunk.
> Doug Cutting pointed it out in HADOOP-312: 
> https://issues.apache.org/jira/browse/HADOOP-312?focusedCommentId=12429498&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12429498
> HDFS-1611 also corrected this inconsistency.
> This could have been avoided by switching from log4j to slf4j's {} format, as 
> CASSANDRA-625 (2010/3) and ZOOKEEPER-850 (2012/1) did, which gives cleaner 
> code and slightly higher performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6468) Consistent log severity level guards and statements in RMContainerAllocator

2015-09-08 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated MAPREDUCE-6468:
--
Summary: Consistent log severity level guards and statements in 
RMContainerAllocator  (was: Consistent log severity level guards and statements 
)

> Consistent log severity level guards and statements in RMContainerAllocator
> ---
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jackie Chang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9995-00.patch, HADOOP-9995.patch, 
> MAPREDUCE-6468-01.patch
>
>
> Developers use logs to do in-house debugging. These log statements are later 
> demoted to less severe levels and usually are guarded by their matching 
> severity levels. However, we do see inconsistencies in trunk. A log statement 
> like 
> {code}
>if (LOG.isDebugEnabled()) {
> LOG.info("Assigned container (" + allocated + ") "
> {code}
> doesn't make much sense, because the message is written at INFO level yet is 
> only emitted when DEBUG is enabled. Previous issues have tried to correct this 
> inconsistency. I am proposing a comprehensive correction across trunk.
> Doug Cutting pointed it out in HADOOP-312: 
> https://issues.apache.org/jira/browse/HADOOP-312?focusedCommentId=12429498&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12429498
> HDFS-1611 also corrected this inconsistency.
> This could have been avoided by switching from log4j to slf4j's {} format, as 
> CASSANDRA-625 (2010/3) and ZOOKEEPER-850 (2012/1) did, which gives cleaner 
> code and slightly higher performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6468) Consistent log severity level guards and statements in MapReduce project

2015-09-08 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated MAPREDUCE-6468:
--
Summary: Consistent log severity level guards and statements in MapReduce 
project  (was: Consistent log severity level guards and statements in 
RMContainerAllocator)

> Consistent log severity level guards and statements in MapReduce project
> 
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jackie Chang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9995-00.patch, HADOOP-9995.patch, 
> MAPREDUCE-6468-01.patch
>
>
> Developers use logs to do in-house debugging. These log statements are later 
> demoted to less severe levels and usually are guarded by their matching 
> severity levels. However, we do see inconsistencies in trunk. A log statement 
> like 
> {code}
>if (LOG.isDebugEnabled()) {
> LOG.info("Assigned container (" + allocated + ") "
> {code}
> doesn't make much sense, because the message is written at INFO level yet is 
> only emitted when DEBUG is enabled. Previous issues have tried to correct this 
> inconsistency. I am proposing a comprehensive correction across trunk.
> Doug Cutting pointed it out in HADOOP-312: 
> https://issues.apache.org/jira/browse/HADOOP-312?focusedCommentId=12429498&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12429498
> HDFS-1611 also corrected this inconsistency.
> This could have been avoided by switching from log4j to slf4j's {} format, as 
> CASSANDRA-625 (2010/3) and ZOOKEEPER-850 (2012/1) did, which gives cleaner 
> code and slightly higher performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6471) Document distcp incremental copy

2015-09-08 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created MAPREDUCE-6471:


 Summary: Document distcp incremental copy 
 Key: MAPREDUCE-6471
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6471
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp
Affects Versions: 2.7.1
Reporter: Arpit Agarwal


MAPREDUCE-5899 added distcp support for incremental copy with a new {{append}} 
flag.

It should be documented.
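A usage sketch of what the documentation might show, assuming the option surface 
added by MAPREDUCE-5899 (an {{-append}} flag used together with {{-update}}); the 
paths are placeholders:
{code}
# Incremental copy: reuse existing data in target files and append only the new bytes.
hadoop distcp -update -append hdfs://nn1:8020/source/dir hdfs://nn2:8020/target/dir
{code}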



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6472) MapReduce AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-08 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6472:
-

 Summary: MapReduce AM should have java.io.tmpdir=./tmp to be 
consistent with tasks
 Key: MAPREDUCE-6472
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6472
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.6.0
Reporter: Jason Lowe


MapReduceChildJVM.getVMCommand ensures that all tasks have 
-Djava.io.tmpdir=./tmp set as part of the task command-line, but this is only 
used for tasks.  The AM itself does not have a corresponding java.io.tmpdir 
setting.  It should also use the same tmpdir setting to avoid cases where the 
AM JVM wants to place files in /tmp by default.
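For illustration, a rough, self-contained sketch of the intended effect. This is 
not the actual code that builds the AM launch command; the class and method names 
are made up (MRAppMaster is the real AM main class):
{code}
import java.util.ArrayList;
import java.util.List;

class AmCommandSketch {
  // The AM launch command would gain the same relative tmpdir setting that
  // tasks already get from MapReduceChildJVM.getVMCommand.
  static List<String> buildAmCommand() {
    List<String> vargs = new ArrayList<>();
    vargs.add("java");
    vargs.add("-Djava.io.tmpdir=./tmp"); // relative to the container's work dir
    vargs.add("org.apache.hadoop.mapreduce.v2.app.MRAppMaster");
    return vargs;
  }
}
{code}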



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6468) Consistent log severity level guards and statements in MapReduce project

2015-09-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735445#comment-14735445
 ] 

Hadoop QA commented on MAPREDUCE-6468:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  24m  0s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 59s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 16s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   3m 47s | The applied patch generated  1 
new checkstyle issues (total was 10, now 10). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   9m 32s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 18s | Pre-build of native portion |
| {color:green}+1{color} | mapreduce tests |   9m 15s | Tests passed in 
hadoop-mapreduce-client-app. |
| {color:green}+1{color} | tools/hadoop tests |   6m 18s | Tests passed in 
hadoop-distcp. |
| {color:green}+1{color} | tools/hadoop tests |  14m 50s | Tests passed in 
hadoop-gridmix. |
| {color:green}+1{color} | yarn tests |   7m  4s | Tests passed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m  0s | Tests passed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |  54m 29s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| {color:red}-1{color} | hdfs tests | 162m 44s | Tests failed in hadoop-hdfs. |
| | | 319m 23s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12754635/MAPREDUCE-6468-02.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 090d266 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/diffcheckstylehadoop-distcp.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-mapreduce-client-app test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
 |
| hadoop-distcp test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-distcp.txt
 |
| hadoop-gridmix test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-gridmix.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5972/console |


This message was automatically generated.

> Consistent log severity level guards and statements in MapReduce project
> 
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> 

[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-08 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5982:
--
Status: Open  (was: Patch Available)

Thanks for the patch, Chang!

Note that the point of this change is to be able to have users locate any 
potential logs for applications that failed in the ASSIGNED state.  By having a 
canned fake started event there's no way to determine which nodemanager tried 
to run the container and therefore we can't provide a good logs link.  We need 
to preserve as much information as we can about the task, and that includes the 
host, http port, etc.

The good news is that we have most of this information from the container that 
was assigned to the task attempt.  See the code for LaunchedContainerTransition 
for details.  It would be nice to see some of the code in that transition 
factored out so it can be reused when we are creating the start event for an 
attempt that failed in the ASSIGNED state.  Also I would hesitate to call it a 
fake event.  It's still a task started event, but we are missing just a few key 
components like the shuffle port and the start time.  If we factor out the code 
from LaunchedContainerTransition then we can drop the "fake" part.
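To make that concrete, a rough sketch of the kind of helper that could be factored 
out. The types here are simplified stand-ins, not the actual MR AM classes:
{code}
// Illustrative only: simplified stand-in types.
class AssignedContainerInfo {
  String nodeHost;    // NodeManager host the container was assigned to
  int nodeHttpPort;   // NodeManager HTTP port, needed to build a logs link
  String containerId;
}

class AttemptStartedInfo {
  String host;
  int httpPort;
  String containerId;
  long startTime;
  int shufflePort;    // unknown when the launch never happened
}

final class StartedEventHelper {
  // Shared by the normal launch path and the failed-in-ASSIGNED path, so the
  // history event always carries the real host/port of the assigned container.
  static AttemptStartedInfo fromAssignedContainer(AssignedContainerInfo c,
      long startTime, int shufflePort) {
    AttemptStartedInfo info = new AttemptStartedInfo();
    info.host = c.nodeHost;
    info.httpPort = c.nodeHttpPort;
    info.containerId = c.containerId;
    info.startTime = startTime;
    info.shufflePort = shufflePort;
    return info;
  }
}
{code}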

Is forceFinishTime really necessary?  We can go ahead and set the launch time 
as we are processing the task started event and then just call setFinishTime.

In general I think we should worry about making sure we generate a proper task 
start event and then let the normal task unsuccessful completion event code 
handle things after that.  For example, in DeallocateContainerTransition I 
think we should be generating the job counter update events for this scenario, 
but we don't since we go down a different task unsuccessful completion event 
handling path when launchTime is zero.  It seems like we should just generate the 
missing start event when launchTime is zero and then fall through to the normal 
unsuccessful completion event handling code in all cases after that.

Nit: missing whitespace before new method in MRApp.


> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.7.1, 2.2.1, 0.23.10
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state (e.g., the container launch 
> fails), it can disappear from the job history.  The task overview page will 
> show subsequent attempts, but the attempt in question is simply missing; for 
> example, attempt ID 1 appears but attempt ID 0 is missing.  Similarly, in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files

2015-09-08 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735558#comment-14735558
 ] 

Arun Suresh commented on MAPREDUCE-6415:


The latest patch looks good. 
+1, as long as [~jlowe] / [~kasha] have no other issues.

Thanks [~rkanter]

> Create a tool to combine aggregated logs into HAR files
> ---
>
> Key: MAPREDUCE-6415
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415.001.patch, 
> MAPREDUCE-6415.002.patch, MAPREDUCE-6415.002.patch, 
> MAPREDUCE-6415_branch-2.001.patch, MAPREDUCE-6415_branch-2.002.patch, 
> MAPREDUCE-6415_branch-2_prelim_001.patch, 
> MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, 
> MAPREDUCE-6415_prelim_002.patch
>
>
> While we wait for YARN-2942 to become viable, it would still be great to 
> mitigate the aggregated-logs problem.  We can write a tool that combines 
> aggregated log files into a single HAR file per application, which should 
> solve the "too many files" and "too many blocks" problems.  See the design 
> document for details.
> See YARN-2942 for more context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files

2015-09-08 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735740#comment-14735740
 ] 

Karthik Kambatla commented on MAPREDUCE-6415:
-

The patch looks mostly good to me, except for the following nits:
# The HadoopArchiveLogs constructor doesn't need the type argument on HashSet in 
Java 7 (see the sketch below)
# HadoopArchiveLogs#run returns -1. Could we return a positive value, say 1, 
instead?
# HadoopArchiveLogs#checkFiles has an unused variable

Once the nits are fixed, I think we should get this in. Shall we work on avoiding 
concurrent runs, and any other bugs we find, in a follow-up JIRA?
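For nit #1, a minimal illustration of the Java 7 diamond operator; the class and 
field names are made up, not the actual HadoopArchiveLogs code:
{code}
import java.util.HashSet;
import java.util.Set;

public class DiamondExample {
  // The constructor's type argument is inferred, so new HashSet<String>()
  // can simply be written as new HashSet<>().
  private final Set<String> eligibleApplications = new HashSet<>();
}
{code}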

> Create a tool to combine aggregated logs into HAR files
> ---
>
> Key: MAPREDUCE-6415
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415.001.patch, 
> MAPREDUCE-6415.002.patch, MAPREDUCE-6415.002.patch, 
> MAPREDUCE-6415_branch-2.001.patch, MAPREDUCE-6415_branch-2.002.patch, 
> MAPREDUCE-6415_branch-2_prelim_001.patch, 
> MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, 
> MAPREDUCE-6415_prelim_002.patch
>
>
> While we wait for YARN-2942 to become viable, it would still be great to 
> mitigate the aggregated-logs problem.  We can write a tool that combines 
> aggregated log files into a single HAR file per application, which should 
> solve the "too many files" and "too many blocks" problems.  See the design 
> document for details.
> See YARN-2942 for more context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-09-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735595#comment-14735595
 ] 

Jason Lowe commented on MAPREDUCE-5002:
---

Thanks for the patch, Chang!

The main change of the patch looks good; there are just some issues with the 
test.  I'm not a fan of having the test reach into the bowels of the class and 
modify a private variable directly.  To me that's sort of an invalid setup.  
Instead the test should be able to accomplish the task via normal interfaces; 
otherwise the reported bug doesn't exist.  In this case we should be able to send appropriate 
allocate responses to convince the original code to accidentally grant a reduce 
container to a map and see that the new code does not do this.  It may be 
simpler to mock up the AM protocol directly rather than using a MockRM to get 
it to grant the excess containers required.
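A hedged sketch of the suggested test setup, not the actual patch: mock the AM-RM 
protocol so an allocate call hands back an excess reduce-sized container, then 
assert that the allocator releases it rather than giving it to a map attempt.
{code}
import static org.mockito.Matchers.any;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Arrays;

import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;

class ExcessReduceContainerTestSketch {
  // Returns a mocked scheduler protocol whose next allocate() response grants
  // the given (excess) container; the RMContainerAllocator under test would be
  // wired to this instead of a real RM.
  static ApplicationMasterProtocol schedulerGrantingExcess(Container excessContainer)
      throws Exception {
    ApplicationMasterProtocol scheduler = mock(ApplicationMasterProtocol.class);
    AllocateResponse response = mock(AllocateResponse.class);
    when(response.getAllocatedContainers())
        .thenReturn(Arrays.asList(excessContainer));
    when(scheduler.allocate(any(AllocateRequest.class))).thenReturn(response);
    return scheduler;
  }
}
{code}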

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6468) Consistent log severity level guards and statements in MapReduce project

2015-09-08 Thread Jagadesh Kiran N (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N updated MAPREDUCE-6468:

Attachment: MAPREDUCE-6468-02.patch

> Consistent log severity level guards and statements in MapReduce project
> 
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jackie Chang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9995-00.patch, HADOOP-9995.patch, 
> MAPREDUCE-6468-01.patch, MAPREDUCE-6468-02.patch
>
>
> Developers use logs to do in-house debugging. These log statements are later 
> demoted to less severe levels and usually are guarded by their matching 
> severity levels. However, we do see inconsistencies in trunk. A log statement 
> like 
> {code}
>if (LOG.isDebugEnabled()) {
> LOG.info("Assigned container (" + allocated + ") "
> {code}
> doesn't make much sense, because the message is written at INFO level yet is 
> only emitted when DEBUG is enabled. Previous issues have tried to correct this 
> inconsistency. I am proposing a comprehensive correction across trunk.
> Doug Cutting pointed it out in HADOOP-312: 
> https://issues.apache.org/jira/browse/HADOOP-312?focusedCommentId=12429498&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12429498
> HDFS-1611 also corrected this inconsistency.
> This could have been avoided by switching from log4j to slf4j's {} format, as 
> CASSANDRA-625 (2010/3) and ZOOKEEPER-850 (2012/1) did, which gives cleaner 
> code and slightly higher performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files

2015-09-08 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-6415:
-
Attachment: MAPREDUCE-6415.003.patch

The 003 patch addresses the issues Karthik pointed out.  I agree that we can 
follow up with those other things in new JIRAs.

> Create a tool to combine aggregated logs into HAR files
> ---
>
> Key: MAPREDUCE-6415
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415.001.patch, 
> MAPREDUCE-6415.002.patch, MAPREDUCE-6415.002.patch, MAPREDUCE-6415.003.patch, 
> MAPREDUCE-6415_branch-2.001.patch, MAPREDUCE-6415_branch-2.002.patch, 
> MAPREDUCE-6415_branch-2.003.patch, MAPREDUCE-6415_branch-2_prelim_001.patch, 
> MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, 
> MAPREDUCE-6415_prelim_002.patch
>
>
> While we wait for YARN-2942 to become viable, it would still be great to 
> mitigate the aggregated-logs problem.  We can write a tool that combines 
> aggregated log files into a single HAR file per application, which should 
> solve the "too many files" and "too many blocks" problems.  See the design 
> document for details.
> See YARN-2942 for more context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files

2015-09-08 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-6415:
-
Attachment: MAPREDUCE-6415_branch-2.003.patch

> Create a tool to combine aggregated logs into HAR files
> ---
>
> Key: MAPREDUCE-6415
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415.001.patch, 
> MAPREDUCE-6415.002.patch, MAPREDUCE-6415.002.patch, 
> MAPREDUCE-6415_branch-2.001.patch, MAPREDUCE-6415_branch-2.002.patch, 
> MAPREDUCE-6415_branch-2.003.patch, MAPREDUCE-6415_branch-2_prelim_001.patch, 
> MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, 
> MAPREDUCE-6415_prelim_002.patch
>
>
> While we wait for YARN-2942 to become viable, it would still be great to 
> mitigate the aggregated-logs problem.  We can write a tool that combines 
> aggregated log files into a single HAR file per application, which should 
> solve the "too many files" and "too many blocks" problems.  See the design 
> document for details.
> See YARN-2942 for more context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files

2015-09-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735879#comment-14735879
 ] 

Hadoop QA commented on MAPREDUCE-6415:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 34s | Findbugs (version 3.0.0) 
appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  6s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 38s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | shellcheck |   0m  5s | There were no new shellcheck 
(v0.3.3) issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   0m 12s | Post-patch findbugs 
hadoop-assemblies compilation is broken. |
| {color:red}-1{color} | findbugs |   0m 24s | Post-patch findbugs 
hadoop-tools/hadoop-tools-dist compilation is broken. |
| {color:green}+1{color} | findbugs |   0m 24s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | assemblies tests |   0m 10s | Tests passed in 
hadoop-assemblies. |
| {color:green}+1{color} | tools/hadoop tests |   0m 13s | Tests passed in 
hadoop-tools-dist. |
| | |  37m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12754769/MAPREDUCE-6415.003.patch
 |
| Optional Tests | javadoc javac unit shellcheck findbugs checkstyle |
| git revision | trunk / d9c1fab |
| hadoop-assemblies test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5974/artifact/patchprocess/testrun_hadoop-assemblies.txt
 |
| hadoop-tools-dist test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5974/artifact/patchprocess/testrun_hadoop-tools-dist.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5974/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5974/console |


This message was automatically generated.

> Create a tool to combine aggregated logs into HAR files
> ---
>
> Key: MAPREDUCE-6415
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415.001.patch, 
> MAPREDUCE-6415.002.patch, MAPREDUCE-6415.002.patch, MAPREDUCE-6415.003.patch, 
> MAPREDUCE-6415_branch-2.001.patch, MAPREDUCE-6415_branch-2.002.patch, 
> MAPREDUCE-6415_branch-2.003.patch, MAPREDUCE-6415_branch-2_prelim_001.patch, 
> MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, 
> MAPREDUCE-6415_prelim_002.patch
>
>
> While we wait for YARN-2942 to become viable, it would still be great to 
> mitigate the aggregated-logs problem.  We can write a tool that combines 
> aggregated log files into a single HAR file per application, which should 
> solve the "too many files" and "too many blocks" problems.  See the design 
> document for details.
> See YARN-2942 for more context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MAPREDUCE-6472) MapReduce AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-08 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned MAPREDUCE-6472:


Assignee: Naganarasimha G R

> MapReduce AM should have java.io.tmpdir=./tmp to be consistent with tasks
> -
>
> Key: MAPREDUCE-6472
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6472
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Naganarasimha G R
>
> MapReduceChildJVM.getVMCommand ensures that all tasks have 
> -Djava.io.tmpdir=./tmp set as part of the task command-line, but this is only 
> used for tasks.  The AM itself does not have a corresponding java.io.tmpdir 
> setting.  It should also use the same tmpdir setting to avoid cases where the 
> AM JVM wants to place files in /tmp by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side

2015-09-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735105#comment-14735105
 ] 

Jason Lowe commented on MAPREDUCE-5870:
---

Thanks for updating the patch!

We cannot change ClientProtocol.setJobPriority to take an integer rather than a 
String.  That's a backwards-incompatible change.

Why are we adding ClientProtocol.getJobPriority?  We are already returning the 
job priority via getJobStatus, so I'm a little confused on why we are extending 
the protocol to get information the client can already retrieve.

Does it make more sense to have LocalJobRunner.getJobPriority return a normal 
or default priority?  Returning null seems disruptive, as priorities really 
don't matter in the local job mode scenario.  In a similar sense, I'm wondering 
if we should have LocalJobRunner.setJobPriority simply track or ignore the 
priority rather than explode.  Since it already exploded for setJobPriority 
before, we can tackle that in a separate JIRA if desired.

Why does JobConf.setPriorityAsInteger take a boxed type rather than a normal 
int and call Integer.toString?

I think the new TypeConverter functions are a little too generically named.  
It's a bit dangerous to have a toYarn that takes something as generic as a 
String and returns an int -- it could easily be misused in other contexts that 
have nothing to do with job priorities.  I'd name these fromYarnPriority, 
toYarnPriority, etc. so there's no chance of accidental collision.

convertPriorityToInteger needs to handle the case where the string is itself an 
integer, otherwise setJobPriorityAsInteger followed by getJobPriorityAsInteger 
doesn't work as expected.  Trying to do JobPriority.valueOf on a string that 
may not contain a valid enum is going to throw an IllegalArgumentException.  
Please add unit tests for the various get/set as integer scenarios on both Job 
and JobConf to make sure we're handling them properly.
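A minimal sketch of the kind of handling meant here; the method name follows the 
patch's convertPriorityToInteger, but the enum-to-int mapping below is only a 
placeholder, not the real conversion:
{code}
// Accept either a numeric string or a JobPriority enum name, so
// setJobPriorityAsInteger followed by getJobPriorityAsInteger round-trips
// instead of hitting IllegalArgumentException in valueOf.
static int convertPriorityToInteger(String priority) {
  try {
    return Integer.parseInt(priority);  // the string is already an integer
  } catch (NumberFormatException e) {
    // valueOf still throws for strings that are neither a number nor an enum name
    return org.apache.hadoop.mapreduce.JobPriority.valueOf(priority).ordinal();
  }
}
{code}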

Why was the call to testChangingJobPriority removed from TestMRJobClient?  The 
function is still there, although in practice it is never called.


> Support for passing Job priority through Application Submission Context in 
> Mapreduce Side
> -
>
> Key: MAPREDUCE-5870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-MAPREDUCE-5870.patch, 0002-MAPREDUCE-5870.patch, 
> 0003-MAPREDUCE-5870.patch, 0004-MAPREDUCE-5870.patch, 
> 0005-MAPREDUCE-5870.patch, 0006-MAPREDUCE-5870.patch, Yarn-2002.1.patch
>
>
> Job priority can be set from the client side as below [configuration and API]:
>   a.  JobConf.getJobPriority() and Job.setPriority(JobPriority priority)
>   b.  We can also use the configuration "mapreduce.job.priority".
> Now this job priority can be passed in the ApplicationSubmissionContext from 
> the client side.
> Here we can reuse the MRJobConfig.PRIORITY configuration.
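For reference, a minimal self-contained sketch of the two client-side options 
listed above; the job name is made up:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobPriority;

public class PriorityExample {
  public static Job buildJob() throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapreduce.job.priority", "HIGH");  // b. via configuration
    Job job = Job.getInstance(conf, "priority-example");
    job.setPriority(JobPriority.HIGH);           // a. via the API
    return job;
  }
}
{code}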



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6468) Consistent log severity level guards and statements in MapReduce project

2015-09-08 Thread Jagadesh Kiran N (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734959#comment-14734959
 ] 

Jagadesh Kiran N commented on MAPREDUCE-6468:
-

Thanks [~ozawa] for your review. I have updated the patch; please review. No 
change is required for the ProtobufRpcEngine.java file.

> Consistent log severity level guards and statements in MapReduce project
> 
>
> Key: MAPREDUCE-6468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jackie Chang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9995-00.patch, HADOOP-9995.patch, 
> MAPREDUCE-6468-01.patch, MAPREDUCE-6468-02.patch
>
>
> Developers use logs to do in-house debugging. These log statements are later 
> demoted to less severe levels and usually are guarded by their matching 
> severity levels. However, we do see inconsistencies in trunk. A log statement 
> like 
> {code}
>if (LOG.isDebugEnabled()) {
> LOG.info("Assigned container (" + allocated + ") "
> {code}
> doesn't make much sense, because the message is written at INFO level yet is 
> only emitted when DEBUG is enabled. Previous issues have tried to correct this 
> inconsistency. I am proposing a comprehensive correction across trunk.
> Doug Cutting pointed it out in HADOOP-312: 
> https://issues.apache.org/jira/browse/HADOOP-312?focusedCommentId=12429498&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12429498
> HDFS-1611 also corrected this inconsistency.
> This could have been avoided by switching from log4j to slf4j's {} format, as 
> CASSANDRA-625 (2010/3) and ZOOKEEPER-850 (2012/1) did, which gives cleaner 
> code and slightly higher performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6302) deadlock in a job between map and reduce cores allocation

2015-09-08 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-6302:

Status: Open  (was: Patch Available)

> deadlock in a job between map and reduce cores allocation 
> --
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: mai shurong
>Assignee: Anubhav Dhoot
>Priority: Critical
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, 
> log.txt, mr-6302-prelim.patch, queue_with_max163cores.png, 
> queue_with_max263cores.png, queue_with_max333cores.png
>
>
> I submitted a big job, which has 500 maps and 350 reduces, to a 
> queue (fair scheduler) with a 300-core maximum. When the big MapReduce job had 
> run 100% of its maps, the 300 reduces occupied the 300 max cores in the queue. 
> Then a map failed and was retried, waiting for a core, while the 300 reduces 
> were waiting for the failed map to finish, so a deadlock occurred. As a result, 
> the job was blocked, and later jobs in the queue could not run because no 
> cores were available in the queue.
> I think there is a similar issue for the memory of a queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)