[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2016-09-05 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464330#comment-15464330
 ] 

Brahma Reddy Battula commented on YARN-666:
---

Sorry for coming in late. I feel it would be good for this to be documented, 
as it is for HDFS?

> [Umbrella] Support rolling upgrades in YARN
> ---
>
> Key: YARN-666
> URL: https://issues.apache.org/jira/browse/YARN-666
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: graceful, rolling upgrade
> Affects Versions: 2.0.4-alpha
> Reporter: Siddharth Seth
> Fix For: 2.6.0
>
> Attachments: YARN_Rolling_Upgrades.pdf, YARN_Rolling_Upgrades_v2.pdf
>
>
> Jira to track changes required in YARN to allow rolling upgrades, including 
> documentation and possible upgrade routes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2014-05-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993506#comment-13993506
 ] 

Junping Du commented on YARN-666:
-

Linked to two related JIRAs covering work preservation during RM and NM restart.



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-06-17 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686102#comment-13686102
 ] 

Siddharth Seth commented on YARN-666:
-

TBD - handling of Enum fields like AMCommand and NodeAction. This may be 
possible by forcing defaults if a new value needs to be added; alternatively, 
define a new Enum which is used by newer clients.
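
A minimal sketch of the "force a default" option, written in Python for brevity rather than in YARN's Java/protobuf code. The value set and the choice of fallback are assumptions made for illustration; the point is only that an older client maps an unrecognized wire value onto a default it already understands instead of failing.

```python
from enum import Enum


class NodeAction(Enum):
    # Hypothetical value set, as known to an *older* client.
    NORMAL = 0
    RESYNC = 1
    SHUTDOWN = 2


def decode_node_action(wire_value, default=NodeAction.NORMAL):
    """Map a wire-level integer to NodeAction, forcing a default
    when the value was introduced by a newer peer."""
    try:
        return NodeAction(wire_value)
    except ValueError:
        # Unknown value from a newer peer: fall back instead of failing.
        return default


known = decode_node_action(1)
unknown = decode_node_action(7)  # 7 is not defined in this older enum
```

Which default is safe depends on the field; a fallback that is harmless for one enum could be wrong for another, which is presumably why this item is still marked TBD.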


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-16 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659686#comment-13659686
 ] 

Lohit Vijayarenu commented on YARN-666:
---

This looks good. A few minor points/JIRAs covering metrics, reporting, and UI 
page updates for clusters running different versions of the YARN daemons should 
also be included. As Karthik already mentioned, it would be very useful if this 
followed HDFS-2983. That would help people who manage clusters and perform 
rolling upgrades on them.

Another question, regarding draining of NodeManagers: do we have a concept of 
blacklisting a NodeManager today? The reason I ask is that if we know we can 
afford to kill the apps running on a NodeManager, but do not want new jobs to be 
submitted to it, one could potentially use blacklisting.



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659709#comment-13659709
 ] 

Vinod Kumar Vavilapalli commented on YARN-666:
--

[~curino], thanks for the update, interesting stuff. I think we should pursue 
this route and do some experiments. It is much easier to do these experiments in 
2.x given the YARN and MR separation. Maybe there's already a ticket for this. 
Would it be possible to put up your changes, however 'hacky' they might be?

[~lohit], we have per-node health-check monitoring, which blocks bad nodes. 
There isn't any other concept of blacklisting NMs today; that is the reason for 
the proposal to add a decommission.
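
To make the drain/decommission idea concrete, here is a small sketch in Python. The `Node` class, the `draining` flag, and both helper functions are hypothetical and do not correspond to any existing YARN API; they only illustrate a node that receives no new work while its running containers finish.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical, simplified view of a NodeManager as seen by a scheduler.
    name: str
    draining: bool = False
    running: set = field(default_factory=set)


def pick_node(nodes):
    """Return the least-loaded node that is not draining, or None."""
    candidates = [n for n in nodes if not n.draining]
    if not candidates:
        return None
    return min(candidates, key=lambda n: len(n.running))


def safe_to_restart(node):
    """A draining node can be upgraded once its containers have finished."""
    return node.draining and not node.running


nodes = [Node("nm1", running={"c1"}), Node("nm2")]
nodes[1].draining = True          # operator marks nm2 for upgrade
chosen = pick_node(nodes)         # nm2 is skipped even though it is idle
```

Unlike health-check blocking, a drain of this kind is an explicit administrative state, so it can be set before the node is actually unhealthy or stopped.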



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-16 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660040#comment-13660040
 ] 

Carlo Curino commented on YARN-666:
---

Vinod, I completely agree that the YARN/MR separation makes hacking around this 
much simpler.

As soon as we are done polishing/publishing the rest of 
checkpointing/preemption, we will work on rebasing this code and post what we 
have. 
We are also happy to socialize this, both development and experiments. For us 
this was a step towards cheaper checkpointing (as an HDFS-based shuffle is almost 
stateless for checkpoint purposes), but the performance wins are clearly 
interesting and there are quite a few variations you can think of (e.g., a hybrid 
strategy using both streaming and localized data; fun stuff). 

By the way, some of the refactorings we propose in MAPREDUCE-5192 and 
MAPREDUCE-5194 are (aside from their use in checkpointing) useful towards this.



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-15 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659024#comment-13659024
 ] 

Carlo Curino commented on YARN-666:
---

Hi Vinod, I will give you some numbers, but bear in mind that these results are 
very preliminary, based only on a handful of runs on a 9 or 10 machine cluster, 
and without serious tuning of terasort. 

The idea of the solution is for maps to write their output directly into HDFS 
(e.g., with replication turned down to 1). Reducers will be started only when 
maps complete and stream-merge straight out of HDFS (bypassing much of the 
partial merging logic). 
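
The stream-merge step can be pictured as a k-way merge over already-sorted map outputs. The sketch below uses Python's `heapq.merge` on in-memory lists that stand in for HDFS files; the function name and the data are illustrative only, and this is not the code used in the experiments.

```python
import heapq
from operator import itemgetter


def stream_merge(map_outputs):
    """Lazily merge several key-sorted (key, value) streams into one.

    Only one pending record per input is held at a time, so nothing
    needs to be spilled to local disk on the reduce side.
    """
    return heapq.merge(*map_outputs, key=itemgetter(0))


# Each list stands in for one map's sorted output for this reducer.
map_outputs = [
    [("a", 1), ("d", 4)],
    [("b", 2), ("e", 5)],
    [("c", 3)],
]
merged = list(stream_merge(map_outputs))
```

Because the merge is lazy, the reducer can consume records as they arrive instead of first copying and partially merging whole map outputs.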

Key limitations of what we have for now:
1) if a map output is lost, all reducers will have to wait for it to be re-run
2) we have lots of dfsclients open; this might become a problem for HDFS if you 
have too many maps per node. 

We initially tried this as a way to make checkpointing cheaper (no need to save 
any state other than the last-processed key), and we were just hoping for it not 
to be too much worse than regular shuffle. The surprise I mentioned above was 
that we actually observe a substantial speed-up on a simple sort job (on 9 
nodes): 25% at 64GB scale and 31% at 1TB scale. 
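
The remark that only the last-processed key needs to be saved can be illustrated with a small sketch. It assumes unique keys and inputs sorted in ascending order, and it relies on the map outputs still being readable after a restart (as they would be if kept in HDFS); the function and variable names are hypothetical.

```python
import heapq


def resume_merge(sorted_inputs, last_key=None):
    """Merge sorted key streams, skipping keys already processed.

    `last_key` is the only checkpointed state; None means start fresh.
    Assumes keys are unique and inputs are sorted ascending.
    """
    for key in heapq.merge(*sorted_inputs):
        if last_key is None or key > last_key:
            yield key


inputs = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
first_run = []
for key in resume_merge(inputs):
    first_run.append(key)
    if key == 4:          # simulate preemption after processing key 4
        break
checkpoint = first_run[-1]
second_run = list(resume_merge(inputs, last_key=checkpoint))
```

The restarted reducer re-reads the inputs and discards everything up to the checkpointed key, so no merge state has to be persisted.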

This seems to indicate that the penalty of reading through HDFS is actually 
outweighed by the benefits of doing a stream-merge (where data never touches disk 
on the reduce side, other than for reducer output). Probably this reduces seeks 
and uses the drives we read from and write to more efficiently. 
You could imagine getting similar benefits by adding restartability to the HTTP 
client (plus the buffering done by the HDFS client, which was likely beneficial 
in our test). More sophisticated versions of this could also dynamically decide 
whether to stream-merge from a certain map or to copy the data (for example, if 
it is small enough to fit in memory). 

Bottom line, I don't think we should read too much into these results (again, 
very initial), other than that using HDFS as the intermediate data layer is not 
completely infeasible. 




[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-14 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657337#comment-13657337
 ] 

Carlo Curino commented on YARN-666:
---

This seems a very important problem (and a very hard one too). 

Just to toss one more idea around: I think that an HDFS-based shuffle (we are 
playing around with it and performance is much better than expected) could 
simplify some of the problems, as we could piggyback on datanode decommissioning 
mechanics to migrate intermediate data out of a node being decommissioned. 
And (a bit obvious) preemption could be a good tool to make the draining fast 
without wasting work (the administrative scenarios we mentioned during the 
conversation in YARN-45). 



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-14 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657864#comment-13657864
 ] 

Vinod Kumar Vavilapalli commented on YARN-666:
--

bq. Just to toss one more idea around: I think that an HDFS-based shuffle (we 
are playing around with it and performance are much better than expected) 
Carlo, it would be great if you could share some numbers :)



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656331#comment-13656331
 ] 

Siddharth Seth commented on YARN-666:
-

bq. Steps to upgrade a YARN cluster: do you think it would make sense to 
upgrade the NMs first before upgrading the RM. If something goes wrong 
(hopefully not), users can fall-back to the older version.
This really depends. There are situations which involve only an NM bug-fix. For 
such cases, the RM doesn't even need to be upgraded/restarted. Also depends on 
whether new APIs are being added to the RM which upgraded NMs may use.

bq. Considerations (Upgrading the MR runtime): Until YARN/MR go into separate 
projects and release cycles, upgrading YARN alone (say 2.1.0 to 2.1.2) 
shouldn't affect the clients (MR) - no?
This depends upon individual deployments. Sites may choose to deploy YARN/MR in 
a way where they can be upgraded independently. The same example - MR 2.1.2 
which contains AM/MR runtime fixes running against YARN 2.1.0. That's one of 
the main goals of MR being user-land code. Until work preserving restart is 
implemented, there should be a way to upgrade MR without affecting the cluster.

bq. I am assuming the version check will be similar to the one in HDFS-2983.
We can definitely learn from that, if we want to support more specific 
versions than just the ones on individual protocols. I don't think YARN has any 
version checks at the moment, other than the ones performed on API versions by 
the RPC layer. 
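
As an illustration of what a more specific version check could look like, here is a sketch of a "minimum supported version" policy. The policy and both function names are assumptions made for this example; this is not a description of what HDFS-2983 or YARN actually implements.

```python
def parse_version(text):
    """Turn '2.1.0' or '2.0.4-alpha' into a comparable tuple of ints."""
    numeric = text.split("-", 1)[0]
    return tuple(int(part) for part in numeric.split("."))


def is_compatible(peer_version, min_supported):
    """Hypothetical policy: accept any peer at or above a configured minimum."""
    return parse_version(peer_version) >= parse_version(min_supported)


accepted = is_compatible("2.1.2", "2.1.0")
rejected = is_compatible("2.0.4-alpha", "2.1.0")
```

A check like this would sit alongside, not replace, the per-protocol API version checks done by the RPC layer.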



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655036#comment-13655036
 ] 

Hitesh Shah commented on YARN-666:
--

+1 to getting this built out. As they say, the devil is in the details. 



[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2013-05-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655062#comment-13655062
 ] 

Karthik Kambatla commented on YARN-666:
---

Sid - thanks for creating this. Excited.

Just went over the design doc (which BTW is well-articulated) and have the 
following comments:
# Steps to upgrade a YARN cluster: do you think it would make sense to upgrade 
the NMs first, before upgrading the RM? If something goes wrong (hopefully not), 
users can fall back to the older version.
# Considerations (Upgrading the MR runtime): Until YARN/MR go into separate 
projects and release cycles, upgrading YARN alone (say 2.1.0 to 2.1.2) 
shouldn't affect the clients (MR) - no?
# Looks like we need to come up with an appropriate policy for YARN data 
formats in HADOOP-9517.
# I am assuming the version check will be similar to the one in HDFS-2983.
# Big +1 to drain decommission
