Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-07 Thread Vrushali C
As I mentioned in the [VOTE] thread at [1], for Timeline Service v2 we
are thinking about merging to branch-2 sometime in the next couple of weeks.


So far, we have been maintaining a branch-2-based branch, YARN-5355-branch-2,
along with our trunk-based feature branch, YARN-5355. Varun Saxena has been
diligently rebasing it to stay current with branch-2.

Currently, we are testing it with the same due diligence we applied to the
trunk-based YARN-5355 branch, and we will ensure the TSv2 branch-2 code is in
a stable state before it is merged.

We are also looking into backporting the new yarn-ui to branch-2 along with
Sunil Govind. This is a work in progress [2] and needs some testing as well.

thanks

Vrushali

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27734.html

[2] https://issues.apache.org/jira/browse/YARN-7169

On Thu, Sep 7, 2017 at 3:25 PM, Iñigo Goiri wrote:

> Hi Subru,
> We are also discussing the merge of HDFS-10467 (Router-based federation)
> and we would like to target 2.9 to do a full release together with YARN
> federation.
> Chris Douglas already arranged the integration into trunk for 3.0.0 GA.
>
> Regarding the points to cover:
> 1. API compatibility: we just extend ClientProtocol so no changes in the
> API.
> 2. Turning feature off: if the Router is not started, the feature is
> disabled completely.
> 3. Stability/testing: the internal version is heavily tested. We will
> start testing the OSS version soon. In any case, the feature is isolated
> and minor bugs will not affect anybody else other than the users of the
> feature.
> 4. Deployment: we are currently using 2.7.1 and we would like to switch to
> 2.9 when available.
> 5. Timeline: finishing the UI and the security JIRAs in HDFS-10467 should
> give us a ready to use version. There will be small features added but
> nothing major. There are a couple minor issues with the merge
> (e.g., HDFS-12384) but should be worked out soon.
>
> Thanks,
> Inigo

Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-07 Thread Iñigo Goiri
Hi Subru,
We are also discussing the merge of HDFS-10467 (Router-based federation)
and we would like to target 2.9 to do a full release together with YARN
federation.
Chris Douglas already arranged the integration into trunk for 3.0.0 GA.

Regarding the points to cover:
1. API compatibility: we only extend ClientProtocol, so there are no changes
to the API.
2. Turning feature off: if the Router is not started, the feature is
disabled completely.
3. Stability/testing: the internal version is heavily tested. We will start
testing the OSS version soon. In any case, the feature is isolated, and
minor bugs will not affect anybody other than the users of the feature.
4. Deployment: we are currently using 2.7.1 and we would like to switch to
2.9 when available.
5. Timeline: finishing the UI and the security JIRAs in HDFS-10467 should
give us a ready-to-use version. Small features will be added, but
nothing major. There are a couple of minor issues with the merge
(e.g., HDFS-12384), but they should be worked out soon.

Thanks,
Inigo


On Tue, Sep 5, 2017 at 4:26 PM, Jonathan Hung wrote:

> Hi Subru,
>
> Thanks for starting the discussion. We are targeting merging YARN-5734
> (API-based scheduler configuration) to branch-2 before the release of
> 2.9.0, since the feature is close to complete. Regarding the requirements
> for merge,
>
> 1. API compatibility - this feature adds new APIs, does not modify any
> existing ones.
> 2. Turning feature off - using the feature is configurable and is turned
> off by default.
> 3. Stability/testing - this is an RM-only change, so we plan on deploying
> this feature to a test RM and verifying configuration changes for capacity
> scheduler. (Right now fair scheduler is not supported.)
> 4. Deployment - we want to get this feature in to 2.9.0 since we want to
> use this feature and 2.9 version in our next upgrade.
> 5. Timeline - we have one main blocker which we are planning to resolve by
> end of week. The rest of the month will be testing then a merge vote on the
> last week of Sept.
>
> Please let me know if you have any concerns. Thanks!
>
>
> Jonathan Hung

Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-05 Thread Jonathan Hung
Hi Subru,

Thanks for starting the discussion. We are targeting merging YARN-5734
(API-based scheduler configuration) to branch-2 before the release of
2.9.0, since the feature is close to complete. Regarding the requirements
for merge,

1. API compatibility - this feature adds new APIs and does not modify any
existing ones.
2. Turning feature off - using the feature is configurable and is turned
off by default.
3. Stability/testing - this is an RM-only change, so we plan on deploying
this feature to a test RM and verifying configuration changes for the
capacity scheduler. (Right now the fair scheduler is not supported.)
4. Deployment - we want to get this feature into 2.9.0 since we want to
use this feature and the 2.9 version in our next upgrade.
5. Timeline - we have one main blocker, which we are planning to resolve by
the end of the week. The rest of the month will be spent testing, followed
by a merge vote in the last week of September.

Please let me know if you have any concerns. Thanks!


Jonathan Hung

On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis wrote:

> Thanks Vrushali for being entirely open as to the current status of ATSv2.
> I appreciate that we want to ensure things are tested at scale, and as you
> said we are working on that right now on our clusters.
> We have tested the feature to demonstrate it works at what we consider
> moderate scale.
>
> I think the criteria for including this feature in the 2.9 release should
> be if it can be safely turned off and not cause impact to anybody not using
> the new feature. The confidence for this is high for timeline service v2.
>
> Therefore, I think timeline service v2 should definitely be part of 2.9.
> That is the big draw for us to work on stabilizing a 2.9 release rather
> than just going to 2.8 and back-porting things ourselves.
>
> Thanks,
>
> Joep


Re: [DISCUSS] Looking to a 2.9.0 release

2017-07-26 Thread J. Rottinghuis
Thanks, Vrushali, for being entirely open about the current status of ATSv2.
I appreciate that we want to ensure things are tested at scale, and, as you
said, we are working on that right now on our clusters.
We have tested the feature to demonstrate it works at what we consider
moderate scale.

I think the criterion for including this feature in the 2.9 release should
be whether it can be safely turned off without impacting anybody not using
the new feature. Confidence in this is high for Timeline Service v2.

Therefore, I think timeline service v2 should definitely be part of 2.9.
That is the big draw for us to work on stabilizing a 2.9 release rather
than just going to 2.8 and back-porting things ourselves.

Thanks,

Joep

On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
vrushalic2...@gmail.com> wrote:

> Thanks Subru for initiating this discussion.
>
> Wanted to share some thoughts in the context of Timeline Service v2. The
> current status of this module is that we are ramping up for a second merge
> to trunk. We still have a few merge blocker jiras outstanding, which we
> think we will finish soon.
>
> While we have done some testing, we are yet to test at scale. Given all
> this, we were thinking of initially targeting a beta release vehicle rather
> than a stable release.
>
> As such, timeline service v2 has branch-2 branch called as
> YARN-5355-branch-2 in case anyone wants to try it out. Timeline service v2
> can be turned off and should not affect the cluster.
>
> thanks
> Vrushali


Re: [DISCUSS] Looking to a 2.9.0 release

2017-07-26 Thread Vrushali Channapattan
Thanks Subru for initiating this discussion.

I wanted to share some thoughts in the context of Timeline Service v2. The
current status of this module is that we are ramping up for a second merge
to trunk. We still have a few merge-blocker JIRAs outstanding, which we
think we will finish soon.

While we have done some testing, we have yet to test at scale. Given all
this, we were thinking of initially targeting a beta release vehicle rather
than a stable release.

Timeline Service v2 has a branch-2-based branch called
YARN-5355-branch-2, in case anyone wants to try it out. Timeline Service v2
can be turned off and should not affect the cluster.

thanks
Vrushali





On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan wrote:

> Folks,
>
> With the release for 2.8, we would like to look ahead to 2.9 release as
> there are many features/improvements in branch-2 (about 1062 commits), that
> are in need of a release vehicle.
>
> Here's our first cut of the proposal from the YARN side:
>
>    1. Scheduler improvements (decoupling allocation from node heartbeat,
>    allocation ID, concurrency fixes, LightResource etc).
>    2. Timeline Service v2
>    3. Opportunistic containers
>    4. Federation
>
> We would like to hear a formal list from HDFS & Hadoop (& MapReduce if any)
> and will update the Roadmap wiki accordingly.
>
> Considering our familiarity with the above mentioned YARN features, we
> would like to volunteer as the co-RMs for 2.9.0.
>
> We want to keep the timeline at 8-12 weeks to keep the release pragmatic.
>
> Feedback?
>
> -Subru/Arun
>


[DISCUSS] Looking to a 2.9.0 release

2017-07-24 Thread Subru Krishnan
Folks,

With 2.8 released, we would like to look ahead to the 2.9 release, as
there are many features/improvements in branch-2 (about 1062 commits) that
are in need of a release vehicle.

Here's our first cut of the proposal from the YARN side:

   1. Scheduler improvements (decoupling allocation from node heartbeat,
   allocation ID, concurrency fixes, LightResource, etc.).
   2. Timeline Service v2
   3. Opportunistic containers
   4. Federation

We would like to hear a formal list from HDFS & Hadoop (& MapReduce if any)
and will update the Roadmap wiki accordingly.

Considering our familiarity with the above-mentioned YARN features, we
would like to volunteer as the co-RMs for 2.9.0.

We want to keep the timeline at 8-12 weeks to keep the release pragmatic.

Feedback?

-Subru/Arun