[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249127#comment-15249127
 ] 

Hadoop QA commented on MAPREDUCE-6657:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 49s {color} | {color:green} hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.8.0_77 with JDK v1.8.0_77 generated 0 new + 356 unchanged - 6 fixed = 356 total (was 362) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 40s {color} | {color:green} hadoop-mapreduce-client in the patch passed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 42s {color} | {color:green} hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.7.0_95 with JDK v1.7.0_95 generated 0 new + 361 unchanged - 6 fixed = 361 total (was 367) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 53s {color} | {color:green} hadoop-mapreduce-client in the patch passed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 42s {color} | {color:green} hadoop-mapreduce-client-app in the patch passed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 108m 25s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_77. {color} |
| 

[jira] [Updated] (MAPREDUCE-6683) Execute hadoop 1.0.1 application in hadoop 2.6.0 cause Output directory not set exception

2016-04-19 Thread Han Gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Gao updated MAPREDUCE-6683:
---
Summary: Execute hadoop 1.0.1 application in hadoop 2.6.0 cause Output 
directory not set exception  (was: Execute hadoop 1.0.1 application in hadoop 
2.6.0)

> Execute hadoop 1.0.1 application in hadoop 2.6.0 cause Output directory not 
> set exception
> -
>
> Key: MAPREDUCE-6683
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6683
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
> Environment: Linux Ubuntu 12.04, hadoop 2.6.0 
>Reporter: Han Gao
>Priority: Minor
>
> The application runs normally on Hadoop 1.0.1 but fails on 2.6.0, even 
> after being adapted to use the new mapreduce API. 
> org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
>   at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:128)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:889)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6683) Execute hadoop 1.0.1 application in hadoop 2.6.0

2016-04-19 Thread Han Gao (JIRA)
Han Gao created MAPREDUCE-6683:
--

 Summary: Execute hadoop 1.0.1 application in hadoop 2.6.0
 Key: MAPREDUCE-6683
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6683
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
 Environment: Linux Ubuntu 12.04, hadoop 2.6.0 
Reporter: Han Gao
Priority: Minor


The application runs normally on Hadoop 1.0.1 but fails on 2.6.0, even 
after being adapted to use the new mapreduce API. 

org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:128)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:889)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
    at java.lang.Thread.run(Thread.java:745)






[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-19 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248625#comment-15248625
 ] 

Haibo Chen commented on MAPREDUCE-6657:
---

Updated the test method according to [~templedf]'s comments and moved it to a 
new test class, because it cannot share clusters with other test methods in 
TestHistoryFileManager.

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6677.003.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period. However, it should also keep retrying if the NameNode is still in its 
> startup phase, because safe mode does not begin until the NN is out of the 
> startup phase. A RetriableException with the text "NameNode still not started" 
> is thrown when the NN is in its internal service startup phase. We should add 
> a check for this specific exception in isBecauseSafeMode() to account for that.
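
The check proposed in the description above could look roughly like the following. This is a minimal, dependency-free sketch and not the actual patch: the class `StartupRetryCheck`, the method `shouldKeepRetrying`, and the local `SafeModeException` stand-in are all hypothetical names introduced for illustration. Only the message text "NameNode still not started" comes from the issue.

```java
import java.io.IOException;

// Hypothetical sketch; class and method names are illustrative, not Hadoop's.
final class StartupRetryCheck {
    // Message text quoted in the issue description.
    static final String NN_NOT_STARTED = "NameNode still not started";

    // Local stand-in for a safe-mode exception so the sketch has no dependencies.
    static class SafeModeException extends IOException {
        SafeModeException(String message) {
            super(message);
        }
    }

    // Returns true if any exception in the cause chain indicates safe mode
    // or a NameNode that has not finished starting.
    static boolean shouldKeepRetrying(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof SafeModeException) {
                return true;
            }
            String message = t.getMessage();
            if (message != null && message.contains(NN_NOT_STARTED)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable starting = new IOException(NN_NOT_STARTED);
        Throwable other = new IOException("Permission denied");
        System.out.println(shouldKeepRetrying(starting));
        System.out.println(shouldKeepRetrying(other));
    }
}
```

Matching on message text is brittle; it is used here only because the description identifies the exception by its text.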





[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-19 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6677.003.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6677.003.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period. However, it should also keep retrying if the NameNode is still in its 
> startup phase, because safe mode does not begin until the NN is out of the 
> startup phase. A RetriableException with the text "NameNode still not started" 
> is thrown when the NN is in its internal service startup phase. We should add 
> a check for this specific exception in isBecauseSafeMode() to account for that.





[jira] [Comment Edited] (MAPREDUCE-6608) Work Preserving AM Restart for MapReduce

2016-04-19 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248151#comment-15248151
 ] 

Junping Du edited comment on MAPREDUCE-6608 at 4/19/16 4:46 PM:


[~vinodkv], thanks for the review and comments. I think most of your points here 
are solid; however, the comment about "Output Commit of previous tasks" is a bit 
stale.

bq. The new AM needs to make sure that output of previously running containers 
can be safely committed. IIRC, with today's FileOutputCommitter, new AM will 
only promote task-outputs that are present in 
$jobOutput/_temporary/$currentAttemptID/
This was true before YARN-4815. However, after YARN-4815, most of the work of 
committing task output to the job's final output is handled by 
{{FileOutputCommitter.commitTask()}} instead of 
{{FileOutputCommitter.commitJob()}}, so commitJob() is left only with cleaning 
up $jobOutput/_temporary. There is therefore nothing to do here except to make 
sure "mapreduce.fileoutputcommitter.algorithm.version" is set to 2. 
This setting is also an assumption of the work in MAPREDUCE-5485, which is a 
prerequisite for the feature here; otherwise the AM will fail directly if the 
previous AM ended during job commit.

I am investigating the rest of the issues and will bring some possible 
proposals later.


bq. I'd suggest spending more time on the design, at least on some of the areas 
I pointed above and then create a branch, create sub-tasks, do some prototypes 
etc.
+1. This feature could be more work than I expected before. I agree we 
may need a separate branch to develop this in parallel. I will create a 
branch once we finalize our design work. 



was (Author: djp):
[~vinodkv], thanks for the review and comments. I think most of your points here 
are solid; however, the comment about "Output Commit of previous tasks" is a bit 
stale.

bq. The new AM needs to make sure that output of previously running containers 
can be safely committed. IIRC, with today's FileOutputCommitter, new AM will 
only promote task-outputs that are present in 
$jobOutput/_temporary/$currentAttemptID/
This was true before YARN-4815. However, after YARN-4815, most of the work of 
committing task output to the job's final output is handled by 
{{FileOutputCommitter.commitTask()}} instead of 
{{FileOutputCommitter.commitJob()}}, so commitJob() is left only with cleaning 
up $jobOutput/_temporary. There is therefore nothing to do here unless we make 
sure "mapreduce.fileoutputcommitter.algorithm.version" is set to 2. 
This setting is also an assumption of the work in MAPREDUCE-5485, which is a 
prerequisite for the feature here; otherwise the AM will fail directly if the 
previous AM ended during job commit.

I am investigating the rest of the issues and will propose some possible 
solutions later.


bq. I'd suggest spending more time on the design, at least on some of the areas 
I pointed above and then create a branch, create sub-tasks, do some prototypes 
etc.
+1. This feature could be more work than I expected before. I agree we 
may need a separate branch to develop this in parallel. I will create a 
branch once we finalize our design work. 


> Work Preserving AM Restart for MapReduce
> 
>
> Key: MAPREDUCE-6608
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Srikanth Sampath
>Assignee: Srikanth Sampath
> Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, 
> WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart was provided in 
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like 
> to take advantage of it for MapReduce (MR) applications.  The attached 
> document describes some of the challenges and discusses a few options.  We 
> solicit feedback from the community.





[jira] [Commented] (MAPREDUCE-6608) Work Preserving AM Restart for MapReduce

2016-04-19 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248151#comment-15248151
 ] 

Junping Du commented on MAPREDUCE-6608:
---

[~vinodkv], thanks for the review and comments. I think most of your points here 
are solid; however, the comment about "Output Commit of previous tasks" is a bit 
stale.

bq. The new AM needs to make sure that output of previously running containers 
can be safely committed. IIRC, with today's FileOutputCommitter, new AM will 
only promote task-outputs that are present in 
$jobOutput/_temporary/$currentAttemptID/
This was true before YARN-4815. However, after YARN-4815, most of the work of 
committing task output to the job's final output is handled by 
{{FileOutputCommitter.commitTask()}} instead of 
{{FileOutputCommitter.commitJob()}}, so commitJob() is left only with cleaning 
up $jobOutput/_temporary. There is therefore nothing to do here unless we make 
sure "mapreduce.fileoutputcommitter.algorithm.version" is set to 2. 
This setting is also an assumption of the work in MAPREDUCE-5485, which is a 
prerequisite for the feature here; otherwise the AM will fail directly if the 
previous AM ended during job commit.

I am investigating the rest of the issues and will propose some possible 
solutions later.


bq. I'd suggest spending more time on the design, at least on some of the areas 
I pointed above and then create a branch, create sub-tasks, do some prototypes 
etc.
+1. This feature could be more work than I expected before. I agree we 
may need a separate branch to develop this in parallel. I will create a 
branch once we finalize our design work. 


> Work Preserving AM Restart for MapReduce
> 
>
> Key: MAPREDUCE-6608
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Srikanth Sampath
>Assignee: Srikanth Sampath
> Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, 
> WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart was provided in 
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like 
> to take advantage of it for MapReduce (MR) applications.  The attached 
> document describes some of the challenges and discusses a few options.  We 
> solicit feedback from the community.





[jira] [Updated] (MAPREDUCE-5817) Mappers get rescheduled on node transition even after all reducers are completed

2016-04-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated MAPREDUCE-5817:
--
Fix Version/s: (was: 2.8.0)
   2.7.3

> Mappers get rescheduled on node transition even after all reducers are 
> completed
> 
>
> Key: MAPREDUCE-5817
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.3.0
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Fix For: 2.7.3
>
> Attachments: MAPREDUCE-5817.001.patch, MAPREDUCE-5817.002.patch, 
> mapreduce-5817.patch
>
>
> We're seeing a behavior where a job runs long after all reducers were already 
> finished. We found that the job was rescheduling and running a number of 
> mappers beyond the point of reducer completion. In one situation, the job ran 
> for some 9 more hours after all reducers completed!
> This happens because whenever a node transition (to an unusable state) comes 
> into the app master, it just reschedules all mappers that already ran on the 
> node in all cases.
> Therefore, any node transition has the potential to extend the job period. 
> Once this window opens, another node transition can prolong it, and this can 
> happen indefinitely in theory.
> If there is some instability in the pool (unhealthy, etc.) for a duration, 
> then any big job is severely vulnerable to this problem.
> If all reducers have been completed, JobImpl.actOnUnusableNode() should not 
> reschedule mapper tasks. If all reducers are completed, the mapper outputs 
> are no longer needed, and there is no need to reschedule mapper tasks as they 
> would not be consumed anyway.
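
The rule proposed in the description above can be illustrated with a small, dependency-free sketch. It is not the code of JobImpl.actOnUnusableNode(): the class `UnusableNodeSketch`, the method `mappersToReschedule`, and the use of plain strings for task IDs are hypothetical simplifications. In this sketch a job with zero reducers also counts as "all reducers completed", which is an assumption rather than something the issue states.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposed decision; names are illustrative only.
final class UnusableNodeSketch {
    // Returns the mappers that should be rerun after a node becomes unusable.
    // If every reducer has already completed, the mapper outputs are no longer
    // needed, so nothing is rescheduled.
    static List<String> mappersToReschedule(List<String> completedMappersOnNode,
                                            int completedReducers,
                                            int totalReducers) {
        if (completedReducers >= totalReducers) {
            return Collections.emptyList();
        }
        return new ArrayList<>(completedMappersOnNode);
    }

    public static void main(String[] args) {
        List<String> onNode = new ArrayList<>();
        onNode.add("map_0");
        onNode.add("map_3");
        // Reducers still running: mappers on the lost node are rerun.
        System.out.println(mappersToReschedule(onNode, 2, 4));
        // All reducers done: nothing is rerun.
        System.out.println(mappersToReschedule(onNode, 4, 4));
    }
}
```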





[jira] [Commented] (MAPREDUCE-5817) Mappers get rescheduled on node transition even after all reducers are completed

2016-04-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248114#comment-15248114
 ] 

Wangda Tan commented on MAPREDUCE-5817:
---

Updated fix version.

> Mappers get rescheduled on node transition even after all reducers are 
> completed
> 
>
> Key: MAPREDUCE-5817
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.3.0
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Fix For: 2.7.3
>
> Attachments: MAPREDUCE-5817.001.patch, MAPREDUCE-5817.002.patch, 
> mapreduce-5817.patch
>
>
> We're seeing a behavior where a job runs long after all reducers were already 
> finished. We found that the job was rescheduling and running a number of 
> mappers beyond the point of reducer completion. In one situation, the job ran 
> for some 9 more hours after all reducers completed!
> This happens because whenever a node transition (to an unusable state) comes 
> into the app master, it just reschedules all mappers that already ran on the 
> node in all cases.
> Therefore, any node transition has the potential to extend the job period. 
> Once this window opens, another node transition can prolong it, and this can 
> happen indefinitely in theory.
> If there is some instability in the pool (unhealthy, etc.) for a duration, 
> then any big job is severely vulnerable to this problem.
> If all reducers have been completed, JobImpl.actOnUnusableNode() should not 
> reschedule mapper tasks. If all reducers are completed, the mapper outputs 
> are no longer needed, and there is no need to reschedule mapper tasks as they 
> would not be consumed anyway.





[jira] [Updated] (MAPREDUCE-6680) JHS UserLogDir scan algorithm sometime could skip directory with update in CloudFS (Azure FileSystem, S3, etc.)

2016-04-19 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6680:
--
Labels: Azure S3  (was: )

> JHS UserLogDir scan algorithm sometime could skip directory with update in 
> CloudFS (Azure FileSystem, S3, etc.)
> ---
>
> Key: MAPREDUCE-6680
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6680
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>  Labels: Azure, S3
> Attachments: MAPREDUCE-6680-v2.patch, MAPREDUCE-6680-v3.patch, 
> MAPREDUCE-6680.patch
>
>
> In our cluster based on a cloud file system, we noticed that JHS can sometimes 
> skip a directory containing a .jhist file when scanning.
> The behavior is like this:
> The first-round scan doesn't find the .jhist file:
> {noformat}
> 16/04/13 11:14:34 DEBUG azure.NativeAzureFileSystem: Found path as a 
> directory with 6 files in it.
> 16/04/13 11:14:34 DEBUG hs.HistoryFileManager: Found 0 files
> ...
> {noformat}
> Then, we see "Scan not needed of ..." for the same directory every 3 minutes 
> until the application fails with a timeout.
> From our analysis, the root cause is that most cloud file systems 
> (Azure FS, S3, etc.) truncate file/directory modification times to 
> seconds instead of milliseconds, which could be due to a limitation of the 
> HTTP protocol (from the discussion at: 
> https://forums.aws.amazon.com/thread.jspa?messageID=476615). 
> So suppose the time sequence happens to be: the latest non-.jhist file 
> modification in the directory happens at T1, the directory scan happens at T2, 
> and the .jhist file is added to the directory at T3. If {{T1 < T2 < T3}} and 
> T1 equals T3 after truncating to seconds, this issue can appear.
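
The timing condition in the description above can be reproduced with plain arithmetic. The sketch below is illustrative only: the class `ModTimeTruncation`, the helper `scanNeeded`, and the concrete timestamps are hypothetical, and the scan rule (rescan only when the directory's modification time differs from the one recorded at the last scan) is an assumption inferred from the "Scan not needed" log message.

```java
// Hypothetical sketch of the race; names and timestamps are illustrative only.
final class ModTimeTruncation {
    // Truncates a millisecond timestamp to whole seconds, as the issue says
    // some cloud file systems do for modification times.
    static long truncateToSeconds(long millis) {
        return (millis / 1000L) * 1000L;
    }

    // Assumed scan rule: rescan only if the directory's modification time
    // differs from the value recorded at the previous scan.
    static boolean scanNeeded(long recordedModTime, long currentModTime) {
        return recordedModTime != currentModTime;
    }

    public static void main(String[] args) {
        long t1 = 1460546074100L; // last non-.jhist modification
        long t2 = 1460546074500L; // directory scan, records mod time from T1
        long t3 = 1460546074900L; // .jhist file added after the scan

        // With millisecond precision the later change is detected.
        System.out.println(scanNeeded(t1, t3));

        // With second precision T1 and T3 collapse to the same value, so the
        // next scan is skipped even though T1 < T2 < T3.
        System.out.println(scanNeeded(truncateToSeconds(t1), truncateToSeconds(t3)));
    }
}
```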





[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-19 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247698#comment-15247698
 ] 

Daniel Templeton commented on MAPREDUCE-6657:
-

The message says that the server should have timed out, but the assert is 
testing whether the exception message is correct when it does time out.  If it 
doesn't time out, it looks to me like the test will pass.  You should probably 
also have an {{Assert.fail()}} after the {{serviceInit()}} call.
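
The pattern suggested above can be sketched without JUnit as follows. Everything here is a hypothetical stand-in: the class `ExpectedTimeoutPattern`, the fake `serviceInit(boolean)`, and the "Timed out" message are not taken from the patch, and a thrown `AssertionError` plays the role of {{Assert.fail()}}.

```java
// Hypothetical sketch of the "fail if nothing was thrown" test pattern.
final class ExpectedTimeoutPattern {
    // Stand-in for the call under test; throws only when timesOut is true.
    static void serviceInit(boolean timesOut) throws Exception {
        if (timesOut) {
            throw new Exception("Timed out waiting for the file system");
        }
    }

    // Returns normally only if serviceInit() threw with the expected message.
    static void checkTimesOut(boolean timesOut) {
        try {
            serviceInit(timesOut);
            // Plays the role of Assert.fail(): without this line the check
            // would pass silently when no exception is thrown. AssertionError
            // is an Error, so the catch block below does not swallow it.
            throw new AssertionError("serviceInit() should have timed out");
        } catch (Exception e) {
            if (!e.getMessage().contains("Timed out")) {
                throw new AssertionError("unexpected message: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        checkTimesOut(true); // passes: the expected exception was thrown
        try {
            checkTimesOut(false); // no exception thrown, so the check fails
        } catch (AssertionError expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```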

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period. However, it should also keep retrying if the NameNode is still in its 
> startup phase, because safe mode does not begin until the NN is out of the 
> startup phase. A RetriableException with the text "NameNode still not started" 
> is thrown when the NN is in its internal service startup phase. We should add 
> a check for this specific exception in isBecauseSafeMode() to account for that.





[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247263#comment-15247263
 ] 

Hadoop QA commented on MAPREDUCE-6513:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 37s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 5s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} branch-2.7 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} branch-2.7 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} branch-2.7 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} branch-2.7 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 24s {color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app-jdk1.8.0_77 with JDK v1.8.0_77 generated 1 new + 84 unchanged - 0 fixed = 85 total (was 84) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 42s {color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 85 unchanged - 0 fixed = 86 total (was 85) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: patch generated 33 new + 1673 unchanged - 2 fixed = 1706 total (was 1675) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2871 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 1m 11s {color} | {color:red} The patch has 303 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 5s {color} | {color:red} hadoop-mapreduce-client-app in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 46s {color} | {color:green} hadoop-mapreduce-client-app in the patch passed with JDK v1.7.0_95. {color} |
|