Re: [DISCUSS] Looking to Apache Hadoop 3.1 release

2017-09-07 Thread Vinod Kumar Vavilapalli
Thanks for starting this thread, Wangda!

+1 for establishing a faster cadence starting now.

One word of caution, though - the same one I expressed while we were trying 
to do the 2.8 and 3.0 releases at the same time. Please try to avoid 
concurrent releases that split community bandwidth - it's not that we cannot 
do multiple releases in parallel, it's that we will not be able to give our 
best to both.

Thanks
+Vinod

> On Sep 6, 2017, at 11:13 AM, Wangda Tan  wrote:
> 
> Hi all,
> 
> As we discussed on [1], there were proposals from Steve / Vinod etc to have
> a faster cadence of releases and to start thinking of a Hadoop 3.1 release
> earlier than March 2018 as is currently proposed.
> 
> I think this is a good idea. I'd like to start the process sooner, and
> establish timeline etc so that we can be ready when 3.0.0 GA is out. With
> this we can also establish faster cadence for future Hadoop 3.x releases.
> 
> To this end, I propose to target Hadoop 3.1.0 for a release by mid Jan
> 2018. (About 4.5 months from now and 2.5 months after 3.0-GA, instead of
> 6.5 months from now).
> 
> I'd also want to take this opportunity to come up with a more elaborate
> release plan to avoid some of the confusion we had with 3.0 beta. General
> proposal for the timeline (per this other proposal [2])
> - Feature freeze date - all features should be merged by Dec 15, 2017.
> - Code freeze date - blockers/critical only, no more improvements and non
> blocker/critical bug-fixes: Jan 1, 2018.
> - Release date: Jan 15, 2018
> 
> Following is a list of features on my radar which could be candidates for a
> 3.1 release:
> - YARN-5734, Dynamic scheduler queue configuration. (Owner: Jonathan Hung)
> - YARN-5881, Add absolute resource configuration to CapacityScheduler.
> (Owner: Sunil)
> - YARN-5673, Container-executor rewrite for better security, extensibility
> and portability. (Owner: Varun Vasudev)
> - YARN-6223, GPU isolation. (Owner: Wangda)
> 
> And from email [3] mentioned by Andrew, there are several other HDFS
> features that people want to release with 3.1 as well, assuming they fit
> the timelines:
> - Storage Policy Satisfier
> - HDFS tiered storage
> 
> Please let me know if I missed any features targeted to 3.1 per this
> timeline.
> 
> And I want to volunteer myself as release manager of 3.1.0 release. Please
> let me know if you have any suggestions/concerns.
> 
> Thanks,
> Wangda Tan
> 
> [1] http://markmail.org/message/hwar5f5ap654ck5o?q=
> Branch+merges+and+3%2E0%2E0-beta1+scope
> [2] http://markmail.org/message/hwar5f5ap654ck5o?q=Branch+
> merges+and+3%2E0%2E0-beta1+scope#query:Branch%20merges%
> 20and%203.0.0-beta1%20scope+page:1+mid:2hqqkhl2dymcikf5+state:results
> [3] http://markmail.org/message/h35obzqrh3ag6dgn?q=Branch+merge
> s+and+3%2E0%2E0-beta1+scope


-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-07 Thread Vrushali C
As I mentioned in the [VOTE] thread at [1], for Timeline Service v2 we are
thinking about merging to branch2 some time in the next couple of weeks.


So far, we have been maintaining a branch2-based YARN-5355_branch2 along
with our trunk-based feature branch YARN-5355. Varun Saxena has been
diligently rebasing it to stay current with branch2.

Currently, we are in the process of testing it, just as we did our due
diligence with the trunk-based YARN-5355 branch, and we will ensure the
TSv2 branch2 code is in a stable state before it is merged.

We are also looking into backporting the new yarn-ui to branch2 together
with Sunil Govind. This is work in progress [2] and needs some testing as well.

thanks

Vrushali

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27734.html

[2] https://issues.apache.org/jira/browse/YARN-7169

On Thu, Sep 7, 2017 at 3:25 PM, Iñigo Goiri  wrote:

> Hi Subru,
> We are also discussing the merge of HDFS-10467 (Router-based federation)
> and we would like to target 2.9 to do a full release together with YARN
> federation.
> Chris Douglas already arranged the integration into trunk for 3.0.0 GA.
>
> Regarding the points to cover:
> 1. API compatibility: we just extend ClientProtocol, so there are no
> changes in the API.
> 2. Turning the feature off: if the Router is not started, the feature is
> disabled completely.
> 3. Stability/testing: the internal version is heavily tested. We will
> start testing the OSS version soon. In any case, the feature is isolated,
> and minor bugs will not affect anybody other than users of the feature.
> 4. Deployment: we are currently using 2.7.1 and we would like to switch to
> 2.9 when available.
> 5. Timeline: finishing the UI and the security JIRAs in HDFS-10467 should
> give us a ready-to-use version. There will be small features added, but
> nothing major. There are a couple of minor issues with the merge
> (e.g., HDFS-12384), but they should be worked out soon.
>
> Thanks,
> Inigo
>
>
> On Tue, Sep 5, 2017 at 4:26 PM, Jonathan Hung 
> wrote:
>
>> Hi Subru,
>>
>> Thanks for starting the discussion. We are targeting merging YARN-5734
>> (API-based scheduler configuration) to branch-2 before the release of
>> 2.9.0, since the feature is close to complete. Regarding the requirements
>> for merge,
>>
>> 1. API compatibility - this feature adds new APIs, does not modify any
>> existing ones.
>> 2. Turning feature off - using the feature is configurable and is turned
>> off by default.
>> 3. Stability/testing - this is an RM-only change, so we plan on deploying
>> this feature to a test RM and verifying configuration changes for capacity
>> scheduler. (Right now fair scheduler is not supported.)
>> 4. Deployment - we want to get this feature into 2.9.0, since we want to
>> use this feature and the 2.9 version in our next upgrade.
>> 5. Timeline - we have one main blocker which we are planning to resolve by
>> the end of the week. The rest of the month will be testing, then a merge
>> vote in the last week of Sept.
>>
>> Please let me know if you have any concerns. Thanks!
>>
>>
>> Jonathan Hung
>>
>> On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis 
>> wrote:
>>
>> > Thanks Vrushali for being entirely open as to the current status of
>> ATSv2.
>> > I appreciate that we want to ensure things are tested at scale, and as
>> you
>> > said we are working on that right now on our clusters.
>> > We have tested the feature to demonstrate it works at what we consider
>> > moderate scale.
>> >
>> > I think the criterion for including this feature in the 2.9 release
>> > should be whether it can be safely turned off without impacting anybody
>> > not using the new feature. The confidence for this is high for timeline
>> > service v2.
>> >
>> > Therefore, I think timeline service v2 should definitely be part of 2.9.
>> > That is the big draw for us to work on stabilizing a 2.9 release rather
>> > than just going to 2.8 and back-porting things ourselves.
>> >
>> > Thanks,
>> >
>> > Joep
>> >
>> > On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
>> > vrushalic2...@gmail.com> wrote:
>> >
>> > > Thanks Subru for initiating this discussion.
>> > >
>> > > Wanted to share some thoughts in the context of Timeline Service v2.
>> The
>> > > current status of this module is that we are ramping up for a second
>> > merge
>> > > to trunk. We still have a few merge blocker jiras outstanding, which
>> we
>> > > think we will finish soon.
>> > >
>> > > While we have done some testing, we are yet to test at scale. Given
>> all
>> > > this, we were thinking of initially targeting a beta release vehicle
>> > rather
>> > > than a stable release.
>> > >
> > As such, timeline service v2 has a branch-2-based branch called
>> > > YARN-5355-branch-2 in case anyone wants to try it out. Timeline
>> service
>> > v2
>> > > can be turned off and should not affect the cluster.
>> > >
>> > > thanks
>> > > Vrushali
>> > >
>> > >
>> > >
>> > 

2017-09-07 Hadoop 3 release status update

2017-09-07 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-07

Slightly early update since I'll be out tomorrow. We're one week out, and
focus is on blocker burndown.

Highlights:

   - 3.1.0 release planning is underway, led by Wangda. Target release date
   is in January.

Red flags:

   - YARN native services merge vote got a -1 for beta1; I recommended we
   drop it from beta1 and retarget it for a later release.
   - 11 blockers on the dashboard, one more than last week (sad)

Previously tracked beta1 blockers that have been resolved or dropped:

   - HADOOP-14826 was closed as a duplicate of HADOOP-14738.
   - YARN-5536 (Multiple format support (JSON, etc.) for exclude node file
   in NM graceful decommission with timeout): Downgraded in priority in
   favor of YARN-7162, which Robert has posted a patch for.
   - MAPREDUCE-6941 (The default setting doesn't work for MapReduce job): I
   resolved this and Junping confirmed this is fine.


beta1 blockers:

   - HADOOP-14738 (Remove S3N and obsolete bits of S3A; rework docs): Steve
   has been actively revving this, with our new committer Aaron Fabbri ready
   to review. The scope has expanded from HADOOP-14826, so it's not just a
   doc update.
   - HADOOP-14284 (Shade Guava everywhere): No change since last week. This
   is an umbrella JIRA.
   - HADOOP-14771 (hadoop-client does not include hadoop-yarn-client): Patch
   up, needs review, still waiting on Busbey. Bharat gave it a review.
   - YARN-7162 (Remove XML excludes file format): Robert has posted a patch
   and is waiting for a review.
   - HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing
   API): Bharat took this up and turned it into an umbrella.
      - HADOOP-14847 (Remove Guava Supplier and change to java Supplier in
      AMRMClient and AMRMClientAsync): Bharat posted a patch on a subtask to
      fix the known Guava Supplier issue in AMRMClient. Needs a review.
   - HADOOP-14835 (mvn site build throws SAX errors): I'm working on this.
   Debugged it and have a proposed patch up, discussing with Allen.
   - HDFS-12218 (Rename split EC / replicated block metrics in BlockManager):
   I'm working on this, just need to commit it, already have a +1 from Eddy.
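For context on the HADOOP-14847 item, here is a hedged sketch of what the Guava-to-JDK Supplier swap looks like in practice; the `WaitFor` helper below is an invented stand-in, not AMRMClient's real signature:

```java
import java.util.function.Supplier;

// Before this change, methods such as AMRMClient#waitFor accepted Guava's
// com.google.common.base.Supplier, leaking a third-party type into a
// user-facing API. java.util.function.Supplier keeps the same shape with
// no Guava dependency. This helper class is illustrative only.
class WaitFor {
    static void waitFor(Supplier<Boolean> check, long intervalMs, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.get()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IllegalStateException("condition not met in time");
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;  // give up quietly if interrupted
            }
        }
    }
}
```

Callers pass a plain lambda, exactly as they could with the Guava type, so the source-level change for users is minimal.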


beta1 features:

   - Erasure coding
  - There are three must-dos, all being actively worked on.
  - HDFS-7859 is being actively reviewed and revved by Sammi and Kai
  and Eddy.
  - HDFS-12395 was split out of HDFS-7859 to do the edit log changes.
  - HDFS-12218 is discussed above.
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
   - Ray and Allen reviewed Sean's HDFS rolling upgrade scripts.
  - Sean did a run through of the HDFS JACC report and it looked fine.
   - Classpath isolation (HADOOP-11656)
  - Sean has retriaged the subtasks and has been posting patches.
   - Compat guide (HADOOP-13714)
  - Daniel has been collecting feedback on dev lists, but still needs a
  detailed review of the patch.
   - YARN native services
  - Jian sent out the merge vote, but it's been -1'd for beta1 by
  Allen. I propose we drop this from beta1 scope and retarget.
   - TSv2 alpha 2
   - This was merged, no problems thus far (smile)

GA features:

   - Resource profiles (Wangda Tan)
  - Merge vote was sent out. Since branch-3.0 has been cut, this can be
  merged to trunk (3.1.0) and then backported once we've completed testing.
   - HDFS router-based federation (Chris Douglas)
   - This is like YARN federation: very separate, doesn't add new APIs, and
   runs in production at MSFT.
  - If it passes Cloudera internal integration testing, I'm fine
  putting this in for GA.
   - API-based scheduler configuration (Jonathan Hung)
   - Jonathan mentioned that his main goal is to get this in for 2.9.0,
   which seems likely to go out after 3.0.0 GA since there hasn't been any
   serious release planning yet. Jonathan said that delaying this until
   3.1.0 is fine.


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-07 Thread Andrew Wang
Hi folks,

This vote closes today. I see a -1 from Allen on inclusion in beta1. I see
there's active fixing going on, but given that we're one week out from RC0,
I think we should drop this from beta1.

Allen, Jian, others, is this reasonable? What release should we retarget
this for? I don't have a sense for how much work there is left to do, but
as a reminder, we're planning GA for Nov 1st, and 3.1.0 for January.

Best,
Andrew

On Wed, Sep 6, 2017 at 10:19 AM, Jian He  wrote:

> >   Please correct me if I’m wrong, but the current summary of the
> branch, post these changes, looks like:
> Sorry for the confusion - I am actively writing the formal documentation
> for how to use it and how it works, and will post it in a few hours.
>
>
> > On Sep 6, 2017, at 10:15 AM, Allen Wittenauer 
> wrote:
> >
> >
> >> On Sep 5, 2017, at 6:23 PM, Jian He  wrote:
> >>
> >>> If it doesn’t have all the bells and whistles, then it shouldn’t
> be on port 53 by default.
> >> Sure, I’ll change the default port to not use 53 and document it.
> >>> *how* is it getting launched on a privileged port? It sounds like
> the expectation is to run “command” as root.   *ALL* of the previous
> daemons in Hadoop that needed a privileged port used jsvc.  Why isn’t this
> one? These questions matter from a security standpoint.
> >> Yes, it is running as “root” to be able to use the privileged port. The
> DNS server is not yet integrated with the hadoop script.
> >>
> >>> Check the output.  It’s pretty obviously borked:
> >> Thanks for pointing out. Missed this when rebasing onto trunk.
> >
> >
> >   Please correct me if I’m wrong, but the current summary of the
> branch, post these changes, looks like:
> >
> >   * A bunch of mostly new Java code that may or may not have
> javadocs (post-revert YARN-6877, still working out HADOOP-14835)
> >   * ~1/3 of the docs are roadmap/TBD
> >   * ~1/3 of the docs are for an optional DNS daemon that has
> no end user hook to start it
> >   * ~1/3 of the docs are for a REST API that comes from some
> undefined daemon (apiserver?)
> >   * Two new, but undocumented, subcommands to yarn
> >   * There are no docs for admins or users on how to actually
> start or use this completely new/separate/optional feature
> >
> >   How are outside people (e.g., non-branch committers) supposed to
> test this new feature under these conditions?
> >
>
>
>
>


Re: [DISCUSS] Looking to Apache Hadoop 3.1 release

2017-09-07 Thread Wangda Tan
Thanks for all your valuable feedback.

Regarding security issues for alpha features: I completely agree with
Larry - ideally, all alpha features should be disabled by default.

Steve/Arun/Haibo: could you please comment on your features' rough merge
plans and status (alpha/beta)?

I will wait several days to see if there are any other features people want
added to 3.1 before creating the initial release scope/plan on Confluence.

Best,
Wangda


On Thu, Sep 7, 2017 at 12:14 PM, Haibo Chen  wrote:

> Thanks Wangda for initiating 3.1.0 release efforts. One YARN feature I'd
> like to add to 3.1.0 is YARN Oversubscription (YARN-1011)
>
> Best,
> Haibo
>
> On Wed, Sep 6, 2017 at 11:13 AM, Wangda Tan  wrote:
>
>> Hi all,
>>
>> As we discussed on [1], there were proposals from Steve / Vinod etc to
>> have
>> a faster cadence of releases and to start thinking of a Hadoop 3.1 release
>> earlier than March 2018 as is currently proposed.
>>
>> I think this is a good idea. I'd like to start the process sooner, and
>> establish timeline etc so that we can be ready when 3.0.0 GA is out. With
>> this we can also establish faster cadence for future Hadoop 3.x releases.
>>
>> To this end, I propose to target Hadoop 3.1.0 for a release by mid Jan
>> 2018. (About 4.5 months from now and 2.5 months after 3.0-GA, instead of
>> 6.5 months from now).
>>
>> I'd also want to take this opportunity to come up with a more elaborate
>> release plan to avoid some of the confusion we had with 3.0 beta. General
>> proposal for the timeline (per this other proposal [2])
>>  - Feature freeze date - all features should be merged by Dec 15, 2017.
>>  - Code freeze date - blockers/critical only, no more improvements and non
>> blocker/critical bug-fixes: Jan 1, 2018.
>>  - Release date: Jan 15, 2018
>>
>> Following is a list of features on my radar which could be candidates for
>> a
>> 3.1 release:
>> - YARN-5734, Dynamic scheduler queue configuration. (Owner: Jonathan Hung)
>> - YARN-5881, Add absolute resource configuration to CapacityScheduler.
>> (Owner: Sunil)
>> - YARN-5673, Container-executor rewrite for better security, extensibility
>> and portability. (Owner: Varun Vasudev)
>> - YARN-6223, GPU isolation. (Owner: Wangda)
>>
>> And from email [3] mentioned by Andrew, there are several other HDFS
>> features that people want to release with 3.1 as well, assuming they fit
>> the timelines:
>> - Storage Policy Satisfier
>> - HDFS tiered storage
>>
>> Please let me know if I missed any features targeted to 3.1 per this
>> timeline.
>>
>> And I want to volunteer myself as release manager of 3.1.0 release. Please
>> let me know if you have any suggestions/concerns.
>>
>> Thanks,
>> Wangda Tan
>>
>> [1] http://markmail.org/message/hwar5f5ap654ck5o?q=
>> Branch+merges+and+3%2E0%2E0-beta1+scope
>> [2] http://markmail.org/message/hwar5f5ap654ck5o?q=Branch+
>> merges+and+3%2E0%2E0-beta1+scope#query:Branch%20merges%
>> 20and%203.0.0-beta1%20scope+page:1+mid:2hqqkhl2dymcikf5+state:results
>> [3] http://markmail.org/message/h35obzqrh3ag6dgn?q=Branch+merge
>> s+and+3%2E0%2E0-beta1+scope
>>
>
>


Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-07 Thread Iñigo Goiri
Hi Subru,
We are also discussing the merge of HDFS-10467 (Router-based federation)
and we would like to target 2.9 to do a full release together with YARN
federation.
Chris Douglas already arranged the integration into trunk for 3.0.0 GA.

Regarding the points to cover:
1. API compatibility: we just extend ClientProtocol, so there are no
changes in the API.
2. Turning the feature off: if the Router is not started, the feature is
disabled completely.
3. Stability/testing: the internal version is heavily tested. We will start
testing the OSS version soon. In any case, the feature is isolated, and
minor bugs will not affect anybody other than users of the feature.
4. Deployment: we are currently using 2.7.1 and we would like to switch to
2.9 when available.
5. Timeline: finishing the UI and the security JIRAs in HDFS-10467 should
give us a ready-to-use version. There will be small features added, but
nothing major. There are a couple of minor issues with the merge
(e.g., HDFS-12384), but they should be worked out soon.
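For readers new to HDFS-10467, here is a heavily simplified, hypothetical sketch of the routing idea. The real Router implements HDFS's actual ClientProtocol (which is why clients see no API change); the mini-interface, class names, and mount rule below are invented for illustration:

```java
// Invented stand-in for HDFS's ClientProtocol.
interface MiniClientProtocol {
    String getFileInfo(String path);
}

// A downstream namespace (subcluster) answering client calls.
class Subcluster implements MiniClientProtocol {
    private final String name;
    Subcluster(String name) { this.name = name; }
    public String getFileInfo(String path) { return name + ":" + path; }
}

// The Router speaks the same protocol and forwards each call to the
// subcluster that owns the path, so existing clients work unchanged.
// If no Router process is started, clients simply keep talking to the
// namenodes directly - the feature is off.
class Router implements MiniClientProtocol {
    private final Subcluster ns0 = new Subcluster("ns0");
    private final Subcluster ns1 = new Subcluster("ns1");
    public String getFileInfo(String path) {
        // Toy mount table: /data lives on ns1, everything else on ns0.
        return (path.startsWith("/data") ? ns1 : ns0).getFileInfo(path);
    }
}
```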

Thanks,
Inigo


On Tue, Sep 5, 2017 at 4:26 PM, Jonathan Hung  wrote:

> Hi Subru,
>
> Thanks for starting the discussion. We are targeting merging YARN-5734
> (API-based scheduler configuration) to branch-2 before the release of
> 2.9.0, since the feature is close to complete. Regarding the requirements
> for merge,
>
> 1. API compatibility - this feature adds new APIs, does not modify any
> existing ones.
> 2. Turning feature off - using the feature is configurable and is turned
> off by default.
> 3. Stability/testing - this is an RM-only change, so we plan on deploying
> this feature to a test RM and verifying configuration changes for capacity
> scheduler. (Right now fair scheduler is not supported.)
> 4. Deployment - we want to get this feature into 2.9.0, since we want to
> use this feature and the 2.9 version in our next upgrade.
> 5. Timeline - we have one main blocker which we are planning to resolve by
> the end of the week. The rest of the month will be testing, then a merge
> vote in the last week of Sept.
>
> Please let me know if you have any concerns. Thanks!
>
>
> Jonathan Hung
>
> On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis 
> wrote:
>
> > Thanks Vrushali for being entirely open as to the current status of
> ATSv2.
> > I appreciate that we want to ensure things are tested at scale, and as
> you
> > said we are working on that right now on our clusters.
> > We have tested the feature to demonstrate it works at what we consider
> > moderate scale.
> >
> > I think the criterion for including this feature in the 2.9 release
> > should be whether it can be safely turned off without impacting anybody
> > not using the new feature. The confidence for this is high for timeline
> > service v2.
> >
> > Therefore, I think timeline service v2 should definitely be part of 2.9.
> > That is the big draw for us to work on stabilizing a 2.9 release rather
> > than just going to 2.8 and back-porting things ourselves.
> >
> > Thanks,
> >
> > Joep
> >
> > On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
> > vrushalic2...@gmail.com> wrote:
> >
> > > Thanks Subru for initiating this discussion.
> > >
> > > Wanted to share some thoughts in the context of Timeline Service v2.
> The
> > > current status of this module is that we are ramping up for a second
> > merge
> > > to trunk. We still have a few merge blocker jiras outstanding, which we
> > > think we will finish soon.
> > >
> > > While we have done some testing, we are yet to test at scale. Given all
> > > this, we were thinking of initially targeting a beta release vehicle
> > rather
> > > than a stable release.
> > >
> > > As such, timeline service v2 has a branch-2-based branch called
> > > YARN-5355-branch-2 in case anyone wants to try it out. Timeline service
> > v2
> > > can be turned off and should not affect the cluster.
> > >
> > > thanks
> > > Vrushali
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan 
> > wrote:
> > >
> > > > Folks,
> > > >
> > > > With the release for 2.8, we would like to look ahead to 2.9 release
> as
> > > > there are many features/improvements in branch-2 (about 1062
> commits),
> > > that
> > > > are in need of a release vehicle.
> > > >
> > > > Here's our first cut of the proposal from the YARN side:
> > > >
> > > >1. Scheduler improvements (decoupling allocation from node
> > heartbeat,
> > > >allocation ID, concurrency fixes, LightResource etc).
> > > >2. Timeline Service v2
> > > >3. Opportunistic containers
> > > >4. Federation
> > > >
> > > > We would like to hear a formal list from HDFS & Hadoop (& MapReduce
> if
> > > any)
> > > > and will update the Roadmap wiki accordingly.
> > > >
> > > > Considering our familiarity with the above mentioned YARN features,
> we
> > > > would like to volunteer as the co-RMs for 2.9.0.
> > > >
> > > > We want to keep the timeline at 8-12 weeks to keep the release
> > 

Re: DISCUSS: Hadoop Compatibility Guidelines

2017-09-07 Thread Steve Loughran

On 7 Sep 2017, at 19:13, Daniel Templeton 
> wrote:

Good point.  I think it would be valuable to enumerate the policies around the 
versioned state stores.  We have the three you listed. We should probably 
include the HDFS fsimage in that list.  Any others?


S3Guard now stores state in DynamoDB. We do have a version marker and a
written-down policy for how upgrades are done:

https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
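The version-marker approach generalizes to any of the versioned state stores under discussion; here is a generic sketch of the pattern, with all names invented for illustration:

```java
// Generic version-marker pattern: a store writes a format-version record at
// creation time, and clients refuse to read data written in an incompatible
// format. Real stores (S3Guard, RM/NM state stores, fsimage) each define
// their own versioning and upgrade policy; this class is illustrative only.
final class VersionMarker {
    static final int FORMAT_VERSION = 1;  // bumped only per the written policy

    // A conservative compatibility rule: only the exact current format is
    // accepted; anything else requires an explicit, documented upgrade step.
    static boolean isCompatible(int storedVersion) {
        return storedVersion == FORMAT_VERSION;
    }

    static void checkVersion(int storedVersion) {
        if (!isCompatible(storedVersion)) {
            throw new IllegalStateException("Store format " + storedVersion
                + " is incompatible with client format " + FORMAT_VERSION);
        }
    }
}
```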



I also want to add a section that clarifies when it's OK to change the 
visibility or audience of an API.

Daniel

On 9/5/17 11:04 AM, Arun Suresh wrote:
Thanks for starting this Daniel.

I think we should also add a section for store compatibility (all state
stores including RM, NM, Federation etc.). Essentially an explicit policy
detailing when it is OK to change the major and minor versions, and how
they should relate to the Hadoop release version.
Thoughts ?

Cheers
-Arun


On Tue, Sep 5, 2017 at 10:38 AM, Daniel Templeton 
>
wrote:

Good idea.  I should have thought of that. :)  Done.

Daniel


On 9/5/17 10:33 AM, Anu Engineer wrote:

Could you please attach the PDFs to the JIRA. I think the mailer is
stripping them off from the mail.

Thanks
Anu





On 9/5/17, 9:44 AM, "Daniel Templeton" 
> wrote:

Resending with a broader audience, and reattaching the PDFs.
Daniel

On 9/4/17 9:01 AM, Daniel Templeton wrote:

All, in prep for Hadoop 3 beta 1 I've been working on updating the
compatibility guidelines on HADOOP-13714.  I think the initial doc is
more or less complete, so I'd like to open the discussion up to the
broader Hadoop community.

In the new guidelines, I have drawn some lines in the sand regarding
compatibility between releases.  In some cases these lines are more
restrictive than the current practices.  The intent with the new
guidelines is not to limit progress by restricting what goes into a
release, but rather to drive release numbering to keep in line with
the reality of the code.

Please have a read and provide feedback on the JIRA.  I'm sure there
are more than a couple of areas that could be improved.  If you'd
rather not read markdown from a diff patch, I've attached PDFs of the
two modified docs.

Thanks!
Daniel


-
To unsubscribe, e-mail: 
yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 
yarn-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: 
yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 
yarn-dev-h...@hadoop.apache.org




Re: DISCUSS: Hadoop Compatibility Guidelines

2017-09-07 Thread Daniel Templeton
Good point.  I think it would be valuable to enumerate the policies 
around the versioned state stores.  We have the three you listed. We 
should probably include the HDFS fsimage in that list.  Any others?


I also want to add a section that clarifies when it's OK to change the 
visibility or audience of an API.
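To make the visibility/audience question concrete, here is a compilable sketch using simplified stand-ins for Hadoop's real annotations (org.apache.hadoop.classification.InterfaceAudience / InterfaceStability); the class below is invented for illustration:

```java
// Simplified stand-ins for Hadoop's audience/stability annotations,
// declared inline so this sketch compiles on its own.
@interface Public {}         // audience: any downstream project may use it
@interface LimitedPrivate {} // audience: named projects only
@interface Stable {}         // stability: compatible across minor releases

// The guidelines would spell out, for an API like this, in which kind of
// release (major vs. minor) it may move to a narrower audience (e.g.
// LimitedPrivate) or a weaker stability without breaking downstream users.
@Public @Stable
class ExampleClientApi {
    public String apiVersion() { return "1"; }
}
```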


Daniel

On 9/5/17 11:04 AM, Arun Suresh wrote:

Thanks for starting this Daniel.

I think we should also add a section for store compatibility (all state
stores including RM, NM, Federation etc.). Essentially an explicit policy
detailing when it is OK to change the major and minor versions, and how
they should relate to the Hadoop release version.
Thoughts ?

Cheers
-Arun


On Tue, Sep 5, 2017 at 10:38 AM, Daniel Templeton 
wrote:


Good idea.  I should have thought of that. :)  Done.

Daniel


On 9/5/17 10:33 AM, Anu Engineer wrote:


Could you please attach the PDFs to the JIRA. I think the mailer is
stripping them off from the mail.

Thanks
Anu





On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:

Resending with a broader audience, and reattaching the PDFs.

Daniel

On 9/4/17 9:01 AM, Daniel Templeton wrote:


All, in prep for Hadoop 3 beta 1 I've been working on updating the
compatibility guidelines on HADOOP-13714.  I think the initial doc is
more or less complete, so I'd like to open the discussion up to the
broader Hadoop community.

In the new guidelines, I have drawn some lines in the sand regarding
compatibility between releases.  In some cases these lines are more
restrictive than the current practices.  The intent with the new
guidelines is not to limit progress by restricting what goes into a
release, but rather to drive release numbering to keep in line with
the reality of the code.

Please have a read and provide feedback on the JIRA.  I'm sure there
are more than a couple of areas that could be improved.  If you'd
rather not read markdown from a diff patch, I've attached PDFs of the
two modified docs.

Thanks!
Daniel




-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org





-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-09-07 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/

[Sep 6, 2017 6:51:51 AM] (cdouglas) HADOOP-12077. Provide a multi-URI 
replication Inode for ViewFs.
[Sep 6, 2017 8:19:34 PM] (junping_du) YARN-7148. TestLogsCLI fails in trunk and 
branch-2 and javadoc error.
[Sep 6, 2017 8:23:49 PM] (jlowe) YARN-7164. TestAMRMClientOnRMRestart fails 
sporadically with bind
[Sep 6, 2017 9:04:30 PM] (jlowe) HADOOP-14827. Allow StopWatch to accept a 
Timer parameter for tests.
[Sep 6, 2017 9:53:31 PM] (junping_du) YARN-7144. Log Aggregation controller 
should not swallow the exceptions
[Sep 6, 2017 11:39:23 PM] (subru) Revert "Plan/ResourceAllocation data 
structure enhancements required to
[Sep 6, 2017 11:46:01 PM] (subru) YARN-5328. Plan/ResourceAllocation data 
structure enhancements required




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Hard coded reference to an absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:[line 490] 

Failed junit tests :

   hadoop.crypto.key.kms.server.TestKMS 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 
   hadoop.hdfs.TestLeaseRecoveryStriped 
   hadoop.hdfs.TestDFSOutputStream 
   hadoop.hdfs.TestSafeModeWithStripedFile 
   hadoop.hdfs.TestClientProtocolForPipelineRecovery 
   hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 
   hadoop.hdfs.TestReadStripedFileWithMissingBlocks 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 
   hadoop.hdfs.TestEncryptedTransfer 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 
   
hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
 
   
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation 
   hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation 
   hadoop.mapreduce.v2.hs.webapp.TestHSWebApp 
   hadoop.yarn.sls.TestReservationSystemInvariants 
   hadoop.yarn.sls.TestSLSRunner 

Timed out junit tests :

   org.apache.hadoop.hdfs.TestWriteReadStripedFile 
   
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager 
  

   cc:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/diff-compile-cc-root.txt [4.0K]

   javac:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/diff-compile-javac-root.txt [292K]

   checkstyle:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/diff-checkstyle-root.txt [17M]

   pylint:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/diff-patch-pylint.txt [20K]

   shellcheck:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/diff-patch-shellcheck.txt [20K]

   shelldocs:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/diff-patch-shelldocs.txt [12K]

   whitespace:
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/whitespace-eol.txt [11M]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/516/artifact/out/whitespace-tabs.txt [1.2M]

   findbugs:

[jira] [Created] (MAPREDUCE-6952) Using DistributedCache.addFileToClasspath with a rename fragment fails during job submit

2017-09-07 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6952:
-

 Summary: Using DistributedCache.addFileToClasspath with a rename fragment fails during job submit
 Key: MAPREDUCE-6952
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6952
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.8.1, 2.7.4
Reporter: Jason Lowe


Calling DistributedCache.addFileToClasspath with a Path that specifies a URI 
fragment (used to rename the file during localization) causes job submission to 
fail with a FileNotFoundException.
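For readers unfamiliar with rename fragments: the part of the cache-file URI after '#' names the link the file is localized under on the node. The sketch below (self-contained, not Hadoop's actual code; the class and method names are illustrative) shows the convention the report is describing:

```java
import java.net.URI;

public class FragmentDemo {
    // Returns the local link name for a cache entry: the URI fragment if
    // present, otherwise the last component of the path.
    public static String localName(String entry) {
        URI uri = URI.create(entry);
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // With a rename fragment the file should be localized as "dep.jar"
        System.out.println(localName("hdfs:///libs/dep-1.0.jar#dep.jar")); // dep.jar
        // Without one, the original file name is used
        System.out.println(localName("hdfs:///libs/dep-1.0.jar"));         // dep-1.0.jar
    }
}
```

Per the report, job submission on the affected versions fails with FileNotFoundException when such a fragment is present, rather than honoring the rename.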



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6951) Jobs fail when mapreduce.jobhistory.webapp.address is in wrong format

2017-09-07 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created MAPREDUCE-6951:


 Summary: Jobs fail when mapreduce.jobhistory.webapp.address is in wrong format
 Key: MAPREDUCE-6951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.7.3
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


MapReduce jobs fail with the exception below when 
mapreduce.jobhistory.webapp.address is in the wrong format, i.e. not host:port; 
for example, the user has set it to just 19888.

{code}
java.util.NoSuchElementException
at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.getApplicationWebURLOnJHSWithoutScheme(MRWebAppUtil.java:130)
at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.getApplicationWebURLOnJHSWithScheme(MRWebAppUtil.java:156)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.doUnregistration(RMCommunicator.java:218)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.unregister(RMCommunicator.java:188)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStop(RMCommunicator.java:268)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStop(RMContainerAllocator.java:297)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStop(MRAppMaster.java:888)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1667)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1168)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:603)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:651)
{code}
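The NoSuchElementException at the top of the trace is the classic symptom of splitting a "host:port" string and consuming the first element without checking that the separator exists. A minimal, self-contained sketch of the failure mode and a defensive parse (illustrative only — not the MRWebAppUtil implementation; the class and method names here are made up):

```java
import java.util.Optional;

public class WebAppAddr {
    // Defensive host extraction: a value like "19888" has no ':' separator,
    // so there is no host component. Code that assumes host:port and grabs
    // the first split token blindly throws NoSuchElementException instead.
    public static Optional<String> hostOf(String addr) {
        int i = addr.indexOf(':');
        return i > 0 ? Optional.of(addr.substring(0, i)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(hostOf("jhs.example.com:19888")); // Optional[jhs.example.com]
        System.out.println(hostOf("19888"));                 // Optional.empty
    }
}
```

Validating the configured value (or failing with a clear error message naming the property) would avoid surfacing the problem only at AM unregistration time.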





