date:20130911


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764201#comment-13764201
 ] 

Hudson commented on MAPREDUCE-5497:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #329 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/329/])
MAPREDUCE-5497. Changed MRAppMaster to sleep only after doing everything else 
but just before ClientService to avoid race conditions during RM restart. 
Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521699)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/ClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java


 '5s sleep'  in MRAppMaster.shutDownJob is only needed before stopping 
 ClientService
 ---

 Key: MAPREDUCE-5497
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5497
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.1.1-beta

 Attachments: MAPREDUCE-5497.1.patch, MAPREDUCE-5497.1.patch, 
 MAPREDUCE-5497.2.patch, MAPREDUCE-5497.patch


 Since the '5s sleep' is for the purpose to let clients know the final states, 
 put it after other services are stopped and only before stopping 
 ClientService is enough. This can reduce some race conditions like 
 MAPREDUCE-5471

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5020) Compile failure with JDK8


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764200#comment-13764200
 ] 

Hudson commented on MAPREDUCE-5020:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #329 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/329/])
MAPREDUCE-5020. Compile failure with JDK8 (Trevor Robinson via tgraves) 
(tgraves: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521576)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java


 Compile failure with JDK8
 -

 Key: MAPREDUCE-5020
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5020
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.3-alpha, 2.1.0-beta
 Environment: java version 1.8.0-ea
 Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
 Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
Reporter: Trevor Robinson
Assignee: Trevor Robinson
  Labels: build-failure, jdk8
 Fix For: 3.0.0, 2.3.0, 2.1.1-beta

 Attachments: MAPREDUCE-5020.patch


 Compiling {{org/apache/hadoop/mapreduce/lib/partition/InputSampler.java}} 
 fails with the Java 8 preview compiler due to its stricter enforcement of JLS 
 15.12.2.6 (for [Java 
 5|http://docs.oracle.com/javase/specs/jls/se5.0/html/expressions.html#15.12.2.6]
  or [Java 
 7|http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.12.2.6]),
  which demands that methods applicable via unchecked conversion have their 
 return type erased:
 {noformat}
 [ERROR] 
 hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java:[320,35]
  error: incompatible types: Object[] cannot be converted to K[]
 {noformat}
 {code}
   @SuppressWarnings(unchecked) // getInputFormat, getOutputKeyComparator
   public static K,V void writePartitionFile(Job job, SamplerK,V sampler) 
   throws IOException, ClassNotFoundException, InterruptedException {
 Configuration conf = job.getConfiguration();
 final InputFormat inf = 
 ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
 int numPartitions = job.getNumReduceTasks();
 K[] samples = sampler.getSample(inf, job); // returns Object[] according 
 to JLS
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API

2013-09-11 Thread Nemon Lou (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764207#comment-13764207
 ] 

Nemon Lou commented on MAPREDUCE-5485:
--

Some initial thoughts: 
1,Adding a method : boolean isCommitJobRepeatable() for outputCommitters.
The abstract class OutputCommitter will return false for it.
And FileOutputCommitter returns true,as FileOutputCommitter's commitJob will 
remove exists files that will be committed and then do the rename.

For commit repeatable jobs,we have 2,3,4:
2,When commitJob method throws exception,AM will retry commit directly with a 
times limit. 
3,When AM has an error during committing(error is not from the commitJob 
method),it will not reach a job final state ,but just exit and leave work to 
another AM.
4,For secondly started AM,it will check the phase the job has reached.If the 
phase is commit failed,then it's state will reach job committing after recovery 
and start commit again.




 Allow repeating job commit by extending OutputCommitter API
 ---

 Key: MAPREDUCE-5485
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.1.0-beta
Reporter: Nemon Lou
Assignee: Nemon Lou

 There are chances MRAppMaster crush during job committing,or NodeManager 
 restart cause the committing AM exit due to container expire.In these cases 
 ,the job will fail.
 However,some jobs can redo commit so failing the job becomes unnecessary.
 Let clients tell AM to allow redo commit or not is a better choice.
 This idea comes from Jason Lowe's comments in MAPREDUCE-4819 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5020) Compile failure with JDK8


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764291#comment-13764291
 ] 

Hudson commented on MAPREDUCE-5020:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1519 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1519/])
MAPREDUCE-5020. Compile failure with JDK8 (Trevor Robinson via tgraves) 
(tgraves: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521576)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java


 Compile failure with JDK8
 -

 Key: MAPREDUCE-5020
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5020
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.3-alpha, 2.1.0-beta
 Environment: java version 1.8.0-ea
 Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
 Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
Reporter: Trevor Robinson
Assignee: Trevor Robinson
  Labels: build-failure, jdk8
 Fix For: 3.0.0, 2.3.0, 2.1.1-beta

 Attachments: MAPREDUCE-5020.patch


 Compiling {{org/apache/hadoop/mapreduce/lib/partition/InputSampler.java}} 
 fails with the Java 8 preview compiler due to its stricter enforcement of JLS 
 15.12.2.6 (for [Java 
 5|http://docs.oracle.com/javase/specs/jls/se5.0/html/expressions.html#15.12.2.6]
  or [Java 
 7|http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.12.2.6]),
  which demands that methods applicable via unchecked conversion have their 
 return type erased:
 {noformat}
 [ERROR] 
 hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java:[320,35]
  error: incompatible types: Object[] cannot be converted to K[]
 {noformat}
 {code}
   @SuppressWarnings(unchecked) // getInputFormat, getOutputKeyComparator
   public static K,V void writePartitionFile(Job job, SamplerK,V sampler) 
   throws IOException, ClassNotFoundException, InterruptedException {
 Configuration conf = job.getConfiguration();
 final InputFormat inf = 
 ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
 int numPartitions = job.getNumReduceTasks();
 K[] samples = sampler.getSample(inf, job); // returns Object[] according 
 to JLS
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5497) '5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764292#comment-13764292
 ] 

Hudson commented on MAPREDUCE-5497:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1519 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1519/])
MAPREDUCE-5497. Changed MRAppMaster to sleep only after doing everything else 
but just before ClientService to avoid race conditions during RM restart. 
Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521699)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/ClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java


 '5s sleep'  in MRAppMaster.shutDownJob is only needed before stopping 
 ClientService
 ---

 Key: MAPREDUCE-5497
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5497
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.1.1-beta

 Attachments: MAPREDUCE-5497.1.patch, MAPREDUCE-5497.1.patch, 
 MAPREDUCE-5497.2.patch, MAPREDUCE-5497.patch


 Since the '5s sleep' is for the purpose to let clients know the final states, 
 put it after other services are stopped and only before stopping 
 ClientService is enough. This can reduce some race conditions like 
 MAPREDUCE-5471

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5020) Compile failure with JDK8


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764353#comment-13764353
 ] 

Hudson commented on MAPREDUCE-5020:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1545 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1545/])
MAPREDUCE-5020. Compile failure with JDK8 (Trevor Robinson via tgraves) 
(tgraves: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521576)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java


 Compile failure with JDK8
 -

 Key: MAPREDUCE-5020
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5020
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.3-alpha, 2.1.0-beta
 Environment: java version 1.8.0-ea
 Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
 Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
Reporter: Trevor Robinson
Assignee: Trevor Robinson
  Labels: build-failure, jdk8
 Fix For: 3.0.0, 2.3.0, 2.1.1-beta

 Attachments: MAPREDUCE-5020.patch


 Compiling {{org/apache/hadoop/mapreduce/lib/partition/InputSampler.java}} 
 fails with the Java 8 preview compiler due to its stricter enforcement of JLS 
 15.12.2.6 (for [Java 
 5|http://docs.oracle.com/javase/specs/jls/se5.0/html/expressions.html#15.12.2.6]
  or [Java 
 7|http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.12.2.6]),
  which demands that methods applicable via unchecked conversion have their 
 return type erased:
 {noformat}
 [ERROR] 
 hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java:[320,35]
  error: incompatible types: Object[] cannot be converted to K[]
 {noformat}
 {code}
   @SuppressWarnings(unchecked) // getInputFormat, getOutputKeyComparator
   public static K,V void writePartitionFile(Job job, SamplerK,V sampler) 
   throws IOException, ClassNotFoundException, InterruptedException {
 Configuration conf = job.getConfiguration();
 final InputFormat inf = 
 ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
 int numPartitions = job.getNumReduceTasks();
 K[] samples = sampler.getSample(inf, job); // returns Object[] according 
 to JLS
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5497) '5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764354#comment-13764354
 ] 

Hudson commented on MAPREDUCE-5497:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1545 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1545/])
MAPREDUCE-5497. Changed MRAppMaster to sleep only after doing everything else 
but just before ClientService to avoid race conditions during RM restart. 
Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521699)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/ClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java


 '5s sleep'  in MRAppMaster.shutDownJob is only needed before stopping 
 ClientService
 ---

 Key: MAPREDUCE-5497
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5497
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.1.1-beta

 Attachments: MAPREDUCE-5497.1.patch, MAPREDUCE-5497.1.patch, 
 MAPREDUCE-5497.2.patch, MAPREDUCE-5497.patch


 Since the '5s sleep' is for the purpose to let clients know the final states, 
 put it after other services are stopped and only before stopping 
 ClientService is enough. This can reduce some race conditions like 
 MAPREDUCE-5471

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-11 Thread Daryn Sharp (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764384#comment-13764384
 ] 

Daryn Sharp commented on MAPREDUCE-5332:


+HistoryServerFileSystemStateStore+
# Suggest: It may be clearer to rename the {{TOKEN_FOO_PREFIX}} constants to be 
{{TOKEN_FOO_DIR_PREFIX}} or {{TOKEN_*_FILE_PREFIX}}.
# Suggest: I'd consider not having the hardcoded {{ROOT_STATE_DIR_NAME}} added 
to the user's path configured by {{MR_HS_FS_STATE_STORE_URI}}.  Is there an 
advantage to not using exactly what the user specified?  Up to you.
# Question: In {{startStateStorage}}, why 2 mkdirs instead of 1?  A mkdir via 
{{createDir}} is only going to check the permissions of the leaf dir which 
seems dubious.  If any of the parent dirs are owned by another user with open 
permissions, the directory created by the JHS can be deleted and recreated with 
open permissions.  Point is I'm not sure the extra checks add value, but I 
suppose they don't hurt.  Up to you. 
# Bug: Unlike {{HistoryServerMemStateStore}}, there appear to be no checks for 
things being added twice - although arguably those checks all belong in ADTSM.  
Token check is there, but not a secret check.  I think the state stores should 
behave consistently.
# Bug: In {{getBucketPath}}, I think you want to mod (%) the seq number instead 
of dividing?  Otherwise it creates a janitorial job for someone to clean up 
empty directories.

+HistoryServerStateStore+
# Suggest: For clarify, perhaps rename to {{HistoryServerStateStore*Service*}}. 
 I kept getting it confused with {{HistoryServerState}}.
# Suggest: I'd consider removing the dtsm's recover method.  Perhaps 
{{loadState}} can take the dtsm as an argument and directly populate it instead 
of populating an intermediary {{HistoryServerState}} object before populating 
the dtsm.

+JobHistoryServer+
# Bug: Is the {{stateStore}} going to be started twice?  Once in 
{{startService}} if {{recoveryEnabled}}, again by {{super#startService}} when 
it iterates the composite services?
# Bug: Should the state store service be started after being recovered?  Not 
before?
# Suggest: Perhaps the {{stateStore}} should be conditionally created  
registered in {{serviceInit}}, then having the state loading all in 
{{stateStore#start}} invoked by the composite service.  Then there's no need 
for additional logic in {{JHS#serviceStart}}.  Just an idea.

+JobHistoryStateStoreFactory+
# Bug? Seems a bit odd if recovery is enabled but there's no class defined, a 
{{HistoryServerNullStateStore}} is created.  It appears {{JHS#serviceStart}} 
will fail when it calls {{loadState}} and an {{UnsupportedOperationException}} 
is thrown.  The null store seems to have no real value other than deferring an 
error from {{JHS#serviceInit}} to {{JHS#serviceStart}}?
# Suggest: It feels like {{JobHistoryServer}} should only create  register a 
state store if required - which ties in with the prior comment in JHS.  
{{serviceInit}} only asks the factory for a state store if recovery is enabled. 
 The factory throws if no class is defined.

 Support token-preserving restart of history server
 --

 Key: MAPREDUCE-5332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
 MAPREDUCE-5332.patch


 To better support rolling upgrades through a cluster, the history server 
 needs the ability to restart without losing track of delegation tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-4.patch

Updated patch to address Daryn's comments.  Summary of changes:

* Prefixes prefixed
* getBucketPath div-to-mod fix
* state stores renamed to state store services

 Support token-preserving restart of history server
 --

 Key: MAPREDUCE-5332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
 MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch


 To better support rolling upgrades through a cluster, the history server 
 needs the ability to restart without losing track of delegation tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server

[
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764594#comment-13764594
]

Jason Lowe commented on MAPREDUCE-5332:
---

Thanks for the review, Daryn!

bq. Suggest: It may be clearer to rename the TOKEN_FOO_PREFIX constants to be
TOKEN_FOO_DIR_PREFIX or TOKEN_*_FILE_PREFIX.

Will do.

bq. Suggest: I'd consider not having the hardcoded ROOT_STATE_DIR_NAME added to
the user's path configured by MR_HS_FS_STATE_STORE_URI.

I followed the precedent set by FileSystemRMStateStore. It seems safer to add
a possibly redundant directory level in case the user configured the URI to a
public directory intended for other users (e.g.: something like /tmp).

bq. Question: In startStateStorage, why 2 mkdirs instead of 1?

This is primarly because mkdirs isn't guaranteed to set the permissions of any
created parent directories properly. I believe HDFS does this, but
RawLocalFileSystem does not.

bq. Bug: Unlike HistoryServerMemStateStore, there appear to be no checks for
things being added twice - although arguably those checks all belong in ADTSM.
Token check is there, but not a secret check. I think the state stores should
behave consistently.

createFile() should fail if the file is already present, so the file system
store should correctly fail if a token or key is added redundantly.

bq. Bug: In getBucketPath, I think you want to mod (%) the seq number instead
of dividing? Otherwise it creates a janitorial job for someone to clean up
empty directories.

Good catch!

bq. Suggest: For clarify, perhaps rename to HistoryServerStateStore*Service*. I
kept getting it confused with HistoryServerState.

Will do.

bq. Suggest: I'd consider removing the dtsm's recover method. Perhaps loadState
can take the dtsm as an argument and directly populate it instead of populating
an intermediary HistoryServerState object before populating the dtsm.

I'm following RMStateStore precedent here as well. I believe the intent is for
the state object to shield the state stores from knowledge of the secret
manager internals and vice-versa.

{quote}
Bug: Is the stateStore going to be started twice? Once in startService if
recoveryEnabled, again by super#startService when it iterates the composite
services?
Bug: Should the state store service be started after being recovered? Not
before?
{quote}

The stateStore needs to be started before recovery can occur, and starting a
service twice is a no-op. So it should work OK as written. However I'll clean
up the out-of-band state store start with a simple recovery service that is
added to the composite service right after the state store. The recovery
service's start method can check if recovery is enabled and perform the
recovery on the (now started) state store before the other services are started.

bq. Bug? Seems a bit odd if recovery is enabled but there's no class defined, a
HistoryServerNullStateStore is created. It appears JHS#serviceStart will fail
when it calls loadState and an UnsupportedOperationException is thrown. The
null store seems to have no real value other than deferring an error from
JHS#serviceInit to JHS#serviceStart?

The null store is necessary to avoid is-recovery-enabled checks throughout the
secret manager as it updates state. It fails on recovery to catch the scenario
where the user enabled recovery but forgot to configure a state store.

Support token-preserving restart of history server
--

Key: MAPREDUCE-5332
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe
Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch,
MAPREDUCE-5332.patch

To better support rolling upgrades through a cluster, the history server
needs the ability to restart without losing track of delegation tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764709#comment-13764709
 ] 

Hadoop QA commented on MAPREDUCE-5332:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12602618/MAPREDUCE-5332-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapreduce.TestMRJobClient

  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

org.apache.hadoop.mapreduce.v2.TestUberAM

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3993//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3993//console

This message is automatically generated.

 Support token-preserving restart of history server
 --

 Key: MAPREDUCE-5332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
 MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch


 To better support rolling upgrades through a cluster, the history server 
 needs the ability to restart without losing track of delegation tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (MAPREDUCE-5504) mapred queue -info inconsistent with types

2013-09-11 Thread Thomas Graves (JIRA)

Thomas Graves created MAPREDUCE-5504:


 Summary: mapred queue -info inconsistent with types
 Key: MAPREDUCE-5504
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 0.23.9
Reporter: Thomas Graves


$ mapred queue -info default
==
Queue Name : default
Queue State : running
Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 
0.9309831


The capacity is displayed in % as 4, however maximum capacity is displayed as 
an absolute number 0.67 instead of 67%.

We should make these consistent with the type we are displaying

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (MAPREDUCE-5503) TestMRJobClient.testJobClient is failing

Jason Lowe created MAPREDUCE-5503:
-

 Summary: TestMRJobClient.testJobClient is failing
 Key: MAPREDUCE-5503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0
Reporter: Jason Lowe


TestMRJobClient.testJobClient is failing on trunk and causing precommit builds 
to complain:

{noformat}
testJobClient(org.apache.hadoop.mapreduce.TestMRJobClient)  Time elapsed: 
26.361 sec   FAILURE!
junit.framework.AssertionFailedError: expected:1 but was:0
at junit.framework.Assert.fail(Assert.java:50)
at junit.framework.Assert.failNotEquals(Assert.java:287)
at junit.framework.Assert.assertEquals(Assert.java:67)
at junit.framework.Assert.assertEquals(Assert.java:199)
at junit.framework.Assert.assertEquals(Assert.java:205)
at 
org.apache.hadoop.mapreduce.TestMRJobClient.testJobList(TestMRJobClient.java:474)
at 
org.apache.hadoop.mapreduce.TestMRJobClient.testJobClient(TestMRJobClient.java:112)
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764781#comment-13764781
 ] 

Jason Lowe commented on MAPREDUCE-5332:
---

The test failures are unrelated.  TestUberAM still hasn't been fixed, see 
MAPREDUCE-5481.  I can reproduce the TestMRJobClient failure on trunk, filed 
MAPREDUCE-5503.

 Support token-preserving restart of history server
 --

 Key: MAPREDUCE-5332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
 MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch


 To better support rolling upgrades through a cluster, the history server 
 needs the ability to restart without losing track of delegation tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov resolved MAPREDUCE-5501.


Resolution: Won't Fix

This is caused by a bug in MiniYARNCluster. Reported in YARN-1183

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
 Attachments: hanging-rmcontainer-allocator.stdout, 
 hanging-rmcontainer-allocator.syslog


 After running MR job client tests many MRAppMaster processes stay alive. The 
 reason seems that RMContainer Allocator thread ignores InterruptedException 
 and keeps retrying:
 {code}
 2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] 
 org.apache.hadoop.util.ThreadUtil: interrupted while sleeping
 java.lang.InterruptedException: sleep interrupted
 at java.lang.Thread.sleep(Native Method)
 at 
 org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149)
 at com.sun.proxy.$Proxy29.allocate(Unknown Source)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
 at java.lang.Thread.run(Thread.java:680)
 2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.ipc.Client: Retrying connect to server: 
 dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 0 time(s); 
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
 sleepTime=1 SECONDS)
 2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.ipc.Client: Retrying connect to server: 
 dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); 
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
 sleepTime=1 SECONDS)
 {code}
 It takes  6 minutes for the processes to die, and this causes various issues 
 with tests which use the same DFS dir. 
 {code}
 2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error 
 communicating with RM: Could not contact RM after 36 milliseconds.
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM 
 after 36 milliseconds.
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
 at java.lang.Thread.run(Thread.java:680)
 {code}
 Will attach a thread dump separately. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov reassigned MAPREDUCE-5501:
--

Assignee: Andrey Klochkov

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
Assignee: Andrey Klochkov
 Attachments: hanging-rmcontainer-allocator.stdout, 
 hanging-rmcontainer-allocator.syslog


 After running MR job client tests many MRAppMaster processes stay alive. The 
 reason seems that RMContainer Allocator thread ignores InterruptedException 
 and keeps retrying:
 {code}
 2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] 
 org.apache.hadoop.util.ThreadUtil: interrupted while sleeping
 java.lang.InterruptedException: sleep interrupted
 at java.lang.Thread.sleep(Native Method)
 at 
 org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149)
 at com.sun.proxy.$Proxy29.allocate(Unknown Source)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
 at java.lang.Thread.run(Thread.java:680)
 2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.ipc.Client: Retrying connect to server: 
 dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 0 time(s); 
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
 sleepTime=1 SECONDS)
 2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.ipc.Client: Retrying connect to server: 
 dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); 
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
 sleepTime=1 SECONDS)
 {code}
 It takes  6 minutes for the processes to die, and this causes various issues 
 with tests which use the same DFS dir. 
 {code}
 2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error 
 communicating with RM: Could not contact RM after 36 milliseconds.
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM 
 after 36 milliseconds.
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
 at java.lang.Thread.run(Thread.java:680)
 {code}
 Will attach a thread dump separately. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Status: Patch Available  (was: Open)

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980.patch


 The maven surefire plugin supports parallel testing feature. By using it, the 
 tests can be run more faster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core