[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-23 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944353#comment-13944353
 ] 

Jian He commented on YARN-1852:
---

Thanks Rohith for the patch !
Patch looks good.  Did minor modification myself to remove some duplicate 
asserts.

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.patch


 Recovering for failed/killed application throw InvalidStateTransitonException.
 These are logged during recovery of applications.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-23 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1852:
--

Attachment: YARN-1852.2.patch

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.patch


 Recovering for failed/killed application throw InvalidStateTransitonException.
 These are logged during recovery of applications.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-23 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944539#comment-13944539
 ] 

Jian He commented on YARN-1852:
---

Hi [~rohithsharma], just walked through the code again. We should not send the 
ATTEMPT_FAILED/ATTEMPT_KILLED events, if the app was supposed to recover to the 
final state. We should send the events only if the app was not able to recover 
it self. I think the following RMAppImpl.isAppInFinalState has some problem, 
it's checking against the move-to state, while by the time this method is 
called, the app has not yet moved to this state. We may check against 
RMApp.recoveredFinalState state instead?
{code}
// We will replay the final attempt only if last attempt is in final
// state but application is not in final state.
if (rmApp.getCurrentAppAttempt() == appAttempt
 !RMAppImpl.isAppInFinalState(rmApp)
{code}

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.patch


 Recovering for failed/killed application throw InvalidStateTransitonException.
 These are logged during recovery of applications.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-23 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944632#comment-13944632
 ] 

Xuan Gong commented on YARN-1521:
-

updated:
* mark renewDelegationToken,  cancelDelegationToken, updateNodeResource and  
moveApplicationAcrossQueues  as Idempotent
* change submitApplication from AtMostOnce to Idempotetn
* change nodeHeartbeat from Idempotent to AtMostOnce

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong

 After YARN-1028, we add the automatically failover into RMProxy. This JIRA is 
 to identify whether we need to add idempotent annotation and which methods 
 can be marked as idempotent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-23 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1521:


Attachment: YARN-1521.0.patch

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch


 After YARN-1028, we add the automatically failover into RMProxy. This JIRA is 
 to identify whether we need to add idempotent annotation and which methods 
 can be marked as idempotent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-23 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944633#comment-13944633
 ] 

Xuan Gong commented on YARN-1521:
-

The patch includes:
* Mark appropriate protocol methods with idempotent and atmostonce annotation 
based on the proposal.
* Create testcases to test the annotation marked
** Limited scope: For all the testcases, we only test whether the method will 
be re-entry when failover happens. Does not cover the entire logic test. 
** Test strategy: create a separate failover thread with a trigger flag, 
override all APIs that added trigger flag. When the apis are called, we will 
set trigger flag as true to kick off the failover. So We can make sure the 
failover happens during process of the method. If this API is marked as 
idempotent or atmostonce, the testcases will pass; otherwise, they will throw 
the exception.
** Did not add testcases for ResourceManagerAdministrationProtocol. All 
refresh* will be called during the process of transitionToActive that will 
break the test strategy I used here. But I did the manually testing. Simply add 
sleep thread into the refresh*, and verified that all refresh* apes can be 
re-entry when failover happens after we marked all refresh* as idempotent.


 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch


 After YARN-1028, we add the automatically failover into RMProxy. This JIRA is 
 to identify whether we need to add idempotent annotation and which methods 
 can be marked as idempotent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1776) renewDelegationToken should survive RM failover

2014-03-23 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944659#comment-13944659
 ] 

Tsuyoshi OZAWA commented on YARN-1776:
--

Sorry for the delay because I had a flight last weekend.

{code}
we do not support fencing yet
{code}

[~kkambatl], ah, I see. Agree with a latest patch. Thank you for the point.
[~zjshen], thanks for your work!

 renewDelegationToken should survive RM failover
 ---

 Key: YARN-1776
 URL: https://issues.apache.org/jira/browse/YARN-1776
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.4.0

 Attachments: YARN-1776.1.patch, YARN-1776.2.patch, YARN-1776.3.patch, 
 YARN-1776.4.patch, YARN-1776.5.patch, YARN-1776.6.patch


 When a delegation token is renewed, two RMStateStore operations: 1) removing 
 the old DT, and 2) storing the new DT will happen. If RM fails in between. 
 There would be problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944675#comment-13944675
 ] 

Hadoop QA commented on YARN-1521:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636277/YARN-1521.0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3439//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3439//console

This message is automatically generated.

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch


 After YARN-1028, we add the automatically failover into RMProxy. This JIRA is 
 to identify whether we need to add idempotent annotation and which methods 
 can be marked as idempotent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-23 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-1670:


Attachment: YARN-1670-v4-b23.patch

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670.patch, YARN-1670.patch


 We have seen exceptions when using 'yarn logs' to read log files. 
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at 
 org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
 We traced it down to the reader trying to read the file type of the next file 
 but where it reads is still log data from the previous file.  What happened 
 was the Log Length was written as a certain size but the log data was 
 actually longer then that.  
 Inside of the write() routine in LogValue it first writes what the logfile 
 length is, but then when it goes to write the log itself it just goes to the 
 end of the file.  There is a race condition here where if someone is still 
 writing to the file when it goes to be aggregated the length written could be 
 to small.
 We should have the write() routine stop when it writes whatever it said was 
 the length.  It would be nice if we could somehow tell the user it might be 
 truncated but I'm not sure of a good way to do this.
 We also noticed that a bug in readAContainerLogsForALogType where it is using 
 an int for curRead whereas it should be using a long. 
   while (len != -1  curRead  fileLength) {
 This isn't actually a problem right now as it looks like the underlying 
 decoder is doing the right thing and the len condition exits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-23 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-1670:


Attachment: YARN-1670-v4.patch

[~tgraves], [~jeagles] and [~vinodkv], I am adding new patch. I have included 
the check in the while loop to make sure that we do not write the whole buffer 
if the last iteration has the file contents less than buffer size.

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch


 We have seen exceptions when using 'yarn logs' to read log files. 
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at 
 org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
 We traced it down to the reader trying to read the file type of the next file 
 but where it reads is still log data from the previous file.  What happened 
 was the Log Length was written as a certain size but the log data was 
 actually longer then that.  
 Inside of the write() routine in LogValue it first writes what the logfile 
 length is, but then when it goes to write the log itself it just goes to the 
 end of the file.  There is a race condition here where if someone is still 
 writing to the file when it goes to be aggregated the length written could be 
 to small.
 We should have the write() routine stop when it writes whatever it said was 
 the length.  It would be nice if we could somehow tell the user it might be 
 truncated but I'm not sure of a good way to do this.
 We also noticed that a bug in readAContainerLogsForALogType where it is using 
 an int for curRead whereas it should be using a long. 
   while (len != -1  curRead  fileLength) {
 This isn't actually a problem right now as it looks like the underlying 
 decoder is doing the right thing and the len condition exits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944705#comment-13944705
 ] 

Hadoop QA commented on YARN-1670:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636288/YARN-1670-v4.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3440//console

This message is automatically generated.

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch


 We have seen exceptions when using 'yarn logs' to read log files. 
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at 
 org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
 We traced it down to the reader trying to read the file type of the next file 
 but where it reads is still log data from the previous file.  What happened 
 was the Log Length was written as a certain size but the log data was 
 actually longer then that.  
 Inside of the write() routine in LogValue it first writes what the logfile 
 length is, but then when it goes to write the log itself it just goes to the 
 end of the file.  There is a race condition here where if someone is still 
 writing to the file when it goes to be aggregated the length written could be 
 to small.
 We should have the write() routine stop when it writes whatever it said was 
 the length.  It would be nice if we could somehow tell the user it might be 
 truncated but I'm not sure of a good way to do this.
 We also noticed that a bug in readAContainerLogsForALogType where it is using 
 an int for curRead whereas it should be using a long. 
   while (len != -1  curRead  fileLength) {
 This isn't actually a problem right now as it looks like the underlying 
 decoder is doing the right thing and the len condition exits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-23 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944720#comment-13944720
 ] 

Jonathan Eagles commented on YARN-1670:
---

Thanks, [~mdesai]. The above logic seems correct, now. Two minor things.
 - If we move from a count up byte counter to a count down byte counter,  does 
this seem easier to understand?
{code}
long bytesLeft = file.length();
while (len = in.read(buf)) != -1) {
  //If buffer contents within fileLength, write
  if (len  bytesLeft) {
out.write(buf, 0, len);
bytesLeft -= len;
  }
  //else only write contents that are within fileLength, then exit early 
  else {
out.write(buf, 0, (int)bytesLeft);
break;
  }
}
{code}
 - I see the buffer size of 65535 being used (I know, not your code). I wonder 
if this is really intended to be block aligned (64K) since that will result in 
theoretical optimal read performance.

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch


 We have seen exceptions when using 'yarn logs' to read log files. 
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at 
 org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
at 
 org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
 We traced it down to the reader trying to read the file type of the next file 
 but where it reads is still log data from the previous file.  What happened 
 was the Log Length was written as a certain size but the log data was 
 actually longer then that.  
 Inside of the write() routine in LogValue it first writes what the logfile 
 length is, but then when it goes to write the log itself it just goes to the 
 end of the file.  There is a race condition here where if someone is still 
 writing to the file when it goes to be aggregated the length written could be 
 to small.
 We should have the write() routine stop when it writes whatever it said was 
 the length.  It would be nice if we could somehow tell the user it might be 
 truncated but I'm not sure of a good way to do this.
 We also noticed that a bug in readAContainerLogsForALogType where it is using 
 an int for curRead whereas it should be using a long. 
   while (len != -1  curRead  fileLength) {
 This isn't actually a problem right now as it looks like the underlying 
 decoder is doing the right thing and the len condition exits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-23 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944723#comment-13944723
 ] 

Karthik Kambatla commented on YARN-1521:


I would like to take a closer look at the annotations before this gets 
committed. If not urgent, please wait for me until Tuesday.

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch


 After YARN-1028, we add the automatically failover into RMProxy. This JIRA is 
 to identify whether we need to add idempotent annotation and which methods 
 can be marked as idempotent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-03-23 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944730#comment-13944730
 ] 

Tsuyoshi OZAWA commented on YARN-556:
-

[~jianhe], your approach looks good to me. We can test new features with the 
updated protocol. About the NM side, we can choose switch on/off the NM resync 
by using configuration.

[~kkambatl] and [~adhoot], can you attach prototype source code to JIRAs? I'd 
like to contribute this JIRA and work with you.

 RM Restart phase 2 - Work preserving restart
 

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.



--
This message was sent by Atlassian JIRA
(v6.2#6252)