Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-23 Thread Vrushali Channapattan
Hi Naga

Thank you for the kind words; I appreciate your vote very much. Yes, among
everyone who has contributed to the TSv2 code, there has been a
strong sense of camaraderie.

Let me try to address your questions one by one.

bq. Though I have not tested, just went through the jiras and features
added.
Thank you. Testing for TSv2 as such has been done at Twitter, Hortonworks,
Huawei and Cloudera.

bq. whether there were any other key outstanding features left after phase
2 and their impacts?

Yes, we have some features that we would like to build into TSv2 and we
have included those as part of "Current Status and Future Plans" in the
documentation. Some of the future plans include:
- Clustering of the readers
- Migration and compatibility with v.1
- Timeline collectors as separate instances from node managers
- Support for off-cluster clients
- More robust storage fault tolerance
- Better support for long-running apps
- Support for ACLs
- Offline (time-based periodic) aggregation for flows, users, and queues
for reporting and analysis

bq. it would be interesting to share the plans (if available) of upstream
projects like Tez, Spark to incorporate ATSv2.

We had an initial discussion with some of the Tez community members a
few months back. Folks from Twitter, Yahoo!, Hortonworks and Huawei were
part of the discussion. It was a productive session and we came away with a
set of things to do right away. Of those, we have already added features
like support for sub-application entities and simple whitelisting of users
so that the Tez community can start experimenting with it. I believe some of
the Tez folks at Yahoo! are in the very early stages of exploring TSv2.
Rohith Sharma from Hortonworks has done some more groundwork and has some
advanced thoughts around the TSv2-Tez integration considerations.

We also had an initial discussion with the Federation team (Subru,
Carlo) a while back. Accordingly, we have already updated some aspects of
our schema to accommodate things like application-to-cluster mappings.
We have an open JIRA for Federation integration (YARN-5357), which is not
included in this version.

We have not yet reached out to Spark folks but would love to do so.

thanks
Vrushali


On Wed, Aug 23, 2017 at 5:49 PM, Naganarasimha Garla <
naganarasimha...@apache.org> wrote:

> +1 (Binding), Great work guys, unwavering dedication and team effort. Kudos
> to the whole team.
> Though I have not tested, just went through the jiras and features added.
>
> Just wanted to know whether there were any other key outstanding features
> left after phase 2 and their impacts?
> And it would be interesting to share the plans (if available) of upstream
> projects like Tez, Spark to incorporate ATSv2.
>
> Regards,
> + Naga
>
>
>
> On Thu, Aug 24, 2017 at 12:28 AM, varunsax...@apache.org <
> varun.saxena.apa...@gmail.com> wrote:
>
> > Hi All,
> >
> > Just to update.
> > Folks who may be interested in going through the documentation for the
> > feature can refer to current documentation attached in pdf format, on the
> > umbrella JIRA i.e. YARN-5355.
> >
> > Furthermore, we have run the YARN-5355 patch(with all the changes made in
> > the branch) against trunk and the build is almost green barring a few
> > checkstyle issues. The test failures which come up in the build are
> > outstanding issues on trunk.
> > Refer to https://issues.apache.org/jira/browse/YARN-5355?focusedCommentId=16138266&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16138266
> >
> > Thanks,
> > Varun Saxena.

Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-23 Thread Naganarasimha Garla
+1 (Binding), Great work guys, unwavering dedication and team effort. Kudos
to the whole team.
Though I have not tested, just went through the jiras and features added.

Just wanted to know whether there were any other key outstanding features
left after phase 2 and their impacts?
And it would be interesting to share the plans (if available) of upstream
projects like Tez, Spark to incorporate ATSv2.

Regards,
+ Naga



On Thu, Aug 24, 2017 at 12:28 AM, varunsax...@apache.org <
varun.saxena.apa...@gmail.com> wrote:

> Hi All,
>
> Just to update.
> Folks who may be interested in going through the documentation for the
> feature can refer to current documentation attached in pdf format, on the
> umbrella JIRA i.e. YARN-5355.
>
> Furthermore, we have run the YARN-5355 patch(with all the changes made in
> the branch) against trunk and the build is almost green barring a few
> checkstyle issues. The test failures which come up in the build are
> outstanding issues on trunk.
> Refer to https://issues.apache.org/jira/browse/YARN-5355?focusedCommentId=16138266&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16138266
>
> Thanks,
> Varun Saxena.
>
> On Tue, Aug 22, 2017 at 12:02 PM, Vrushali Channapattan <
> vrushalic2...@gmail.com> wrote:
>
> > Hi folks,
> >
> > Per earlier discussion [1], I'd like to start a formal vote to merge
> > feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote
> > will run for 7 days, and will end August 29 11:00 PM PDT.
> >
> > We have previously completed one merge onto trunk [3] and Timeline
> > Service v2 has been part of Hadoop release 3.0.0-alpha1.
> >
> > Since then, we have been working on extending the capabilities of
> > Timeline Service v2 in a feature branch [2] for a while, and we are
> > reasonably confident that the state of the feature meets the criteria to
> > be merged onto trunk and we'd love folks to get their hands on it in a
> > test capacity and provide valuable feedback so that we can make it
> > production-ready.
> >
> > In a nutshell, Timeline Service v.2 delivers significant scalability and
> > usability improvements based on a new architecture. What we would like to
> > merge to trunk is termed "alpha 2" (milestone 2). The feature has a
> > complete end-to-end read/write flow with security and read level
> > authorization via whitelists. You should be able to start setting it up
> > and testing it.
> >
> > At a high level, the following are the key features that have been
> > implemented since alpha1:
> > - Security via Kerberos Authentication and delegation tokens
> > - Read side simple authorization via whitelist
> > - Client configurable entity sort ordering
> > - Richer REST APIs for apps, app attempts, containers, fetching metrics
> >   by timerange, pagination, sub-app entities
> > - Support for storing sub-application entities (entities that exist
> >   outside the scope of an application)
> > - Configurable TTLs (time-to-live) for tables, configurable table
> >   prefixes, configurable hbase cluster
> > - Flow level aggregations done as dynamic (table level) coprocessors
> > - Uses latest stable HBase release 1.2.6
> >
> > There are a total of 82 subtasks that were completed as part of this
> > effort.
> >
> > We paid close attention to ensure that Timeline Service v.2 does not
> > impact existing functionality when disabled (the default).
> >
> > Special thanks to a team of folks who worked hard and contributed towards
> > this effort with patches, reviews and guidance: Rohith Sharma K S, Varun
> > Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep
> > Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Michael Stack.
> >
> > Regards,
> > Vrushali
> >
> > [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> > [2] https://issues.apache.org/jira/browse/YARN-5355
> > [3] https://issues.apache.org/jira/browse/YARN-2928
> > [4] https://github.com/apache/hadoop/commits/YARN-5355
> >
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-23 Thread Ravi Prakash
Also, when people +1 a merge, can they please describe whether they tested or
used the feature, in addition to what is already described in the thread?

On Wed, Aug 23, 2017 at 11:18 AM, Vrushali Channapattan <
vrushalic2...@gmail.com> wrote:

> For timeline service v2, we have completed all subtasks under YARN-5355
> [1].
>
> We initiated a merge-to-trunk vote [2] yesterday.
>
> thanks
> Vrushali
> [1] https://issues.apache.org/jira/browse/YARN-5355
> [2]
> http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201708.mbox/%3CCAE=b_fbLT2J+Ezb4wqdN_UwBiG1Sd5kpqGaw+9Br__zou5yNTQ@mail.gmail.com%3E
>
>
> On Wed, Aug 23, 2017 at 11:12 AM, Vinod Kumar Vavilapalli <
> vino...@apache.org> wrote:
>
> > Agreed. I was very clearly not advocating for rushing in features. If you
> > have followed my past emails, I have only strongly advocated features be
> > worked in branches and get merged when they are in a reasonable state.
> >
> > Each branch contributor group should look at their readiness and merge
> > stuff in assuming that the branch reached a satisfactory state. That’s it.
> >
> > From a release management perspective, blocking features just because we
> > are a month away from the deadline is not reasonable. Let the branch
> > contributors rationalize, make this decision and the rest of us can
> > support them in making the decision.
> >
> > +Vinod
> >
> > > At this point, there have been three planned alphas from September 2016
> > > until July 2017 to "get in features".  While a couple of upcoming
> > > features are "a few weeks" away, I think all of us are aware how
> > > predictable software development schedules can be.  I think we can also
> > > all agree that rushing just to meet a release deadline isn't the best
> > > practice when it comes to software development either.
> > >
> > > Andrew has been very clear about his goals at each step and I think
> > > Wangda's willingness to not rush in resource types was an appropriate
> > > response.  I'm sympathetic to the goals of getting in a feature for 3.0,
> > > but it might be a good idea for each project that is a "few weeks away"
> > > to seriously look at the readiness compared to the features which have
> > > been testing for 6+ months already.
> > >
> > > -Ray
> >
> >
>


[jira] [Created] (HADOOP-14803) Upgrade JUnit 3 TestCase to JUnit 4 in TestS3NInMemoryFileSystem

2017-08-23 Thread Ajay Kumar (JIRA)
Ajay Kumar created HADOOP-14803:
---

     Summary: Upgrade JUnit 3 TestCase to JUnit 4 in TestS3NInMemoryFileSystem
         Key: HADOOP-14803
         URL: https://issues.apache.org/jira/browse/HADOOP-14803
     Project: Hadoop Common
  Issue Type: Test
    Reporter: Ajay Kumar






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-23 Thread varunsax...@apache.org
Hi All,

Just to update.
Folks who may be interested in going through the documentation for the
feature can refer to current documentation attached in pdf format, on the
umbrella JIRA i.e. YARN-5355.

Furthermore, we have run the YARN-5355 patch(with all the changes made in
the branch) against trunk and the build is almost green barring a few
checkstyle issues. The test failures which come up in the build are
outstanding issues on trunk.
Refer to https://issues.apache.org/jira/browse/YARN-5355?focusedCommentId=16138266&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16138266

Thanks,
Varun Saxena.

On Tue, Aug 22, 2017 at 12:02 PM, Vrushali Channapattan <
vrushalic2...@gmail.com> wrote:

> Hi folks,
>
> Per earlier discussion [1], I'd like to start a formal vote to merge
> feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote will
> run for 7 days, and will end August 29 11:00 PM PDT.
>
> We have previously completed one merge onto trunk [3] and Timeline Service
> v2 has been part of Hadoop release 3.0.0-alpha1.
>
> Since then, we have been working on extending the capabilities of Timeline
> Service v2 in a feature branch [2] for a while, and we are reasonably
> confident that the state of the feature meets the criteria to be merged
> onto trunk and we'd love folks to get their hands on it in a test capacity
> and provide valuable feedback so that we can make it production-ready.
>
> In a nutshell, Timeline Service v.2 delivers significant scalability and
> usability improvements based on a new architecture. What we would like to
> merge to trunk is termed "alpha 2" (milestone 2). The feature has a
> complete end-to-end read/write flow with security and read level
> authorization via whitelists. You should be able to start setting it up and
> testing it.
>
> At a high level, the following are the key features that have been
> implemented since alpha1:
> - Security via Kerberos Authentication and delegation tokens
> - Read side simple authorization via whitelist
> - Client configurable entity sort ordering
> - Richer REST APIs for apps, app attempts, containers, fetching metrics by
> timerange, pagination, sub-app entities
> - Support for storing sub-application entities (entities that exist outside
> the scope of an application)
> - Configurable TTLs (time-to-live) for tables, configurable table prefixes,
> configurable hbase cluster
> - Flow level aggregations done as dynamic (table level) coprocessors
> - Uses latest stable HBase release 1.2.6
>
> There are a total of 82 subtasks that were completed as part of this
> effort.
>
> We paid close attention to ensure that Timeline Service v.2 does not
> impact existing functionality when disabled (the default).
>
> Special thanks to a team of folks who worked hard and contributed towards
> this effort with patches, reviews and guidance: Rohith Sharma K S, Varun
> Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep
> Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Michael Stack.
>
> Regards,
> Vrushali
>
> [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> [2] https://issues.apache.org/jira/browse/YARN-5355
> [3] https://issues.apache.org/jira/browse/YARN-2928
> [4] https://github.com/apache/hadoop/commits/YARN-5355
>


Re: [VOTE] Merge HADOOP-13345 (S3Guard feature branch)

2017-08-23 Thread sanjay Radia

+1 (binding)
Thanks to the community for all the hard work that went into this critical piece of work.


sanjay
> 
> 
> On 22 Aug 2017, at 11:17, Steve Loughran wrote:
> 
> +1 (binding)
> 
> I'm happy with it; it's a great piece of work by (in no particular order):
> Chris Nauroth, Aaron Fabbri, Sean McRory & Mingliang Liu, plus a few bits in
> the corners where I got to break things while they were all asleep. Also
> deserving a mention: Thomas Demoor & Ewan Higgs @ WDC for consultancy on the
> corners of S3, everyone who tested it (including our QA team), Sanjay Radia,
> & others.
>
> I've already done a couple of iterations of fixing checkstyles & code reviews,
> so I think it is ready. I also have a branch-2 patch based on earlier work by
> Mingliang, for people who want that.
> 
> 
> 
> 
> On 17 Aug 2017, at 23:07, Aaron Fabbri wrote:
> 
> Hello,
> 
> I'd like to open a vote (7 days, ending August 24 at 3:10 PST) to merge the
> HADOOP-13345 feature branch into trunk.
> 
> This branch contains the new S3Guard feature which adds metadata
> consistency features to the S3A client.  Formatted site documentation can
> be found here:
> 
> https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
> 
> The current patch against trunk is posted here:
> 
> https://issues.apache.org/jira/browse/HADOOP-13998
> 
> The branch modifies the s3a portion of the hadoop-tools/hadoop-aws module:
> 
> - The feature is off by default, and care has been taken to ensure it has
> no impact when disabled.
> - S3Guard can be enabled with the production database which is backed by
> DynamoDB, or with a local, in-memory implementation that facilitates
> integration testing without having to pay for a database.
> - getFileStatus() as well as directory listing consistency has been
> implemented and thoroughly tested, including delete tracking.
> - Convenient Maven profiles for testing with and without S3Guard.
> - New failure injection code and integration tests that exercise it.  We
> use timers and a wrapper around the Amazon SDK client object to force
> consistency delays to occur.  This allows us to assert that S3Guard works
> as advertised.  This will be extended with more types of failure injection
> to continue hardening the S3A client.
> 
> Outside of hadoop-tools/hadoop-aws's s3a directory there are some minor
> changes:
> 
> - core-default.xml defaults and documentation for s3guard parameters.
> - A couple additional FS contract test cases around rename.
> - More goodies in LambdaTestUtils
> - A new CLI tool for inspecting and manipulating S3Guard features,
> including the backing MetadataStore database.
> 
> This branch has seen extensive testing as well as use in production.  This
> branch makes significant improvements to S3A's test toolkit as well.
> 
> Performance is typically on par with, and in some cases better than, the
> existing S3A code without S3Guard enabled.
> 
> This feature was developed with contributions and feedback from many
> people.  I'd like to thank everyone who worked on HADOOP-13345 as well as
> all of those who contributed feedback and work on the original design
> document.
> 
> This is the first major Apache Hadoop project I've worked on from start to
> finish, and I've really enjoyed it.  Please shout if I've missed anything
> important here or in the VOTE process.
> 
> Cheers,
> Aaron Fabbri
> 
> 
> 
> 





[VOTE] Merge YARN-3926 (resource profile) to trunk

2017-08-23 Thread Wangda Tan
 Hi folks,

Per earlier discussion [1], I'd like to start a formal vote to merge
feature branch YARN-3926 (Resource profile) to trunk. The vote will run for
7 days and will end August 30 10:00 AM PDT.

Briefly, YARN-3926 extends the resource model of YARN to support resource
types other than CPU and memory, so it will be a cornerstone of features
like GPU support (YARN-6223), disk scheduling/isolation (YARN-2139), FPGA
support (YARN-5983), and network IO scheduling/isolation (YARN-2140). In
addition, YARN-3926 allows admins to preconfigure resource profiles
in the cluster. For example, m3.large means <2 vcores, 8 GB memory, 64 GB
disk>, so applications can request the "m3.large" profile instead of
specifying a value for every resource type.
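
As a rough illustration of the profile idea (this is not YARN's actual API
or configuration format; the names PROFILES and resolve_request and the
resource keys are hypothetical), a profile can be thought of as a named
bundle of resource values that is expanded when an application asks for it
by name:

```python
# Hypothetical, admin-preconfigured profiles. The "m3.large" values are the
# ones from the example in this message; the key names are assumptions.
PROFILES = {
    "m3.large": {"vcores": 2, "memory_gb": 8, "disk_gb": 64},
}


def resolve_request(profile_name):
    """Expand a profile name into concrete per-resource-type values."""
    if profile_name not in PROFILES:
        raise ValueError(f"unknown resource profile: {profile_name}")
    # Return a copy so callers cannot modify the cluster-wide definition.
    return dict(PROFILES[profile_name])


if __name__ == "__main__":
    print(resolve_request("m3.large"))
```

The point of the indirection is that an application names one profile
rather than listing every resource type, so a newly added resource type
only has to be set in the profile definition.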

There are 32 subtasks that were completed as part of this effort.

This feature needs to be explicitly turned on before use. We paid close
attention to the compatibility, performance, and scalability of this feature.
As mentioned in [1], we didn't see observable performance regression in
large-scale SLS (scheduler load simulator) executions and saw less than 5%
performance regression using the micro benchmark added by YARN-6775.

This feature works end-to-end (including UI/CLI/application/server). We
have set up a cluster with this feature turned on that has been running for
several weeks, and we haven't seen any issues so far.

Merge JIRA: YARN-7013 (Jenkins gave +1 already).
Documentation: YARN-7056

Special thanks to a team of folks who worked hard and contributed towards
this effort including design discussion/development/reviews, etc.: Varun
Vasudev, Sunil Govind, Daniel Templeton, Vinod Vavilapalli, Yufei Gu,
Karthik Kambatla, Jason Lowe, Arun Suresh.

Regards,
Wangda Tan

[1]
http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201708.mbox/%3CCAD%2B%2BeCnjEHU%3D-M33QdjnND0ZL73eKwxRua4%3DBbp4G8inQZmaMg%40mail.gmail.com%3E


Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-23 Thread J. Rottinghuis
+1 (non-binding) for the merge

@Vinod I hope that means a +1 from you as well!

Cheers,

Joep

On Tue, Aug 22, 2017 at 11:15 AM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> Such a great community effort - hats off, team!
>
> Thanks
> +Vinod
>
> > On Aug 21, 2017, at 11:32 PM, Vrushali Channapattan <
> vrushalic2...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Per earlier discussion [1], I'd like to start a formal vote to merge
> > feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote
> > will run for 7 days, and will end August 29 11:00 PM PDT.
> >
> > We have previously completed one merge onto trunk [3] and Timeline
> > Service v2 has been part of Hadoop release 3.0.0-alpha1.
> >
> > Since then, we have been working on extending the capabilities of
> > Timeline Service v2 in a feature branch [2] for a while, and we are
> > reasonably confident that the state of the feature meets the criteria to
> > be merged onto trunk and we'd love folks to get their hands on it in a
> > test capacity and provide valuable feedback so that we can make it
> > production-ready.
> >
> > In a nutshell, Timeline Service v.2 delivers significant scalability and
> > usability improvements based on a new architecture. What we would like to
> > merge to trunk is termed "alpha 2" (milestone 2). The feature has a
> > complete end-to-end read/write flow with security and read level
> > authorization via whitelists. You should be able to start setting it up
> > and testing it.
> >
> > At a high level, the following are the key features that have been
> > implemented since alpha1:
> > - Security via Kerberos Authentication and delegation tokens
> > - Read side simple authorization via whitelist
> > - Client configurable entity sort ordering
> > - Richer REST APIs for apps, app attempts, containers, fetching metrics
> >   by timerange, pagination, sub-app entities
> > - Support for storing sub-application entities (entities that exist
> >   outside the scope of an application)
> > - Configurable TTLs (time-to-live) for tables, configurable table
> >   prefixes, configurable hbase cluster
> > - Flow level aggregations done as dynamic (table level) coprocessors
> > - Uses latest stable HBase release 1.2.6
> >
> > There are a total of 82 subtasks that were completed as part of this
> > effort.
> >
> > We paid close attention to ensure that Timeline Service v.2 does not
> > impact existing functionality when disabled (the default).
> >
> > Special thanks to a team of folks who worked hard and contributed towards
> > this effort with patches, reviews and guidance: Rohith Sharma K S, Varun
> > Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep
> > Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Michael Stack.
> >
> > Regards,
> > Vrushali
> >
> > [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> > [2] https://issues.apache.org/jira/browse/YARN-5355
> > [3] https://issues.apache.org/jira/browse/YARN-2928
> > [4] https://github.com/apache/hadoop/commits/YARN-5355
>
>
>
>


Re: [VOTE] Merge HADOOP-13345 (S3Guard feature branch)

2017-08-23 Thread Aaron Fabbri
On Tue, Aug 22, 2017 at 10:24 AM, Steve Loughran wrote:

> video being processed: https://www.youtube.com/watch?v=oIe5Zl2YsLE&feature=youtu.be
>
>
Awesome demo Steve, thanks for doing this.  Particularly glad to see folks
using and extending the failure injection client.  Demoing the CLI tool was
great as well.

Wanted to mention two things:

1. Authoritative mode is not fully implemented yet with Dynamo (it needs to
persist an extra bit for directories).  I do have an auth-mode patch (done
for a hackathon) that I need to post which shows large performance
improvements over what S3Guard has today.  As you said, we don't consider
authoritative mode ready for production yet: we want to play with it more
and improve the prune algorithm first.  Authoritative mode can be thought
of as a nice bonus in the future: The main goal of S3Guard v1 is to fix the
get / list consistency issues you mentioned, which it does well.

2. Also wanted to thank Lei (Eddy) Xu, he was very active during early
design and contributed some patches as well.

Again, great demo, enjoyed it!

-AF



> It's actually quite hard to show any benefits of s3guard on the command
> line, so I've ended up showing some scala tests where I turn on the
> (bundled) inconsistent AWS client to show how you then need to enable
> s3guard to make the stack traces go away.
>
>
> On 22 Aug 2017, at 11:17, Steve Loughran wrote:
>
> +1 (binding)
>
> I'm happy with it; it's a great piece of work by (in no particular order):
> Chris Nauroth, Aaron Fabbri, Sean McRory & Mingliang Liu, plus a few bits
> in the corners where I got to break things while they were all asleep. Also
> deserving a mention: Thomas Demoor & Ewan Higgs @ WDC for consultancy on
> the corners of S3, everyone who tested it (including our QA team), Sanjay
> Radia, & others.
>
> I've already done a couple of iterations of fixing checkstyles & code
> reviews, so I think it is ready. I also have a branch-2 patch based on
> earlier work by Mingliang, for people who want that.
>
>
>
>
> On 17 Aug 2017, at 23:07, Aaron Fabbri wrote:
>
> Hello,
>
> I'd like to open a vote (7 days, ending August 24 at 3:10 PST) to merge the
> HADOOP-13345 feature branch into trunk.
>
> This branch contains the new S3Guard feature which adds metadata
> consistency features to the S3A client.  Formatted site documentation can
> be found here:
>
> https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
>
> The current patch against trunk is posted here:
>
> https://issues.apache.org/jira/browse/HADOOP-13998
>
> The branch modifies the s3a portion of the hadoop-tools/hadoop-aws module:
>
> - The feature is off by default, and care has been taken to ensure it has
> no impact when disabled.
> - S3Guard can be enabled with the production database which is backed by
> DynamoDB, or with a local, in-memory implementation that facilitates
> integration testing without having to pay for a database.
> - getFileStatus() as well as directory listing consistency has been
> implemented and thoroughly tested, including delete tracking.
> - Convenient Maven profiles for testing with and without S3Guard.
> - New failure injection code and integration tests that exercise it.  We
> use timers and a wrapper around the Amazon SDK client object to force
> consistency delays to occur.  This allows us to assert that S3Guard works
> as advertised.  This will be extended with more types of failure injection
> to continue hardening the S3A client.
>
> Outside of hadoop-tools/hadoop-aws's s3a directory there are some minor
> changes:
>
> - core-default.xml defaults and documentation for s3guard parameters.
> - A couple additional FS contract test cases around rename.
> - More goodies in LambdaTestUtils
> - A new CLI tool for inspecting and manipulating S3Guard features,
> including the backing MetadataStore database.
>
> This branch has seen extensive testing as well as use in production.  This
> branch makes significant improvements to S3A's test toolkit as well.
>
> Performance is typically on par with, and in some cases better than, the
> existing S3A code without S3Guard enabled.
>
> This feature was developed with contributions and feedback from many
> people.  I'd like to thank everyone who worked on HADOOP-13345 as well as
> all of those who contributed feedback and work on the original design
> document.
>
> This is the first major Apache Hadoop project I've worked on from start to
> finish, and I've really enjoyed it.  Please shout if I've missed anything
> important here or in the VOTE process.
>
> Cheers,
> Aaron Fabbri
>
>

Re: Branch merges and 3.0.0-beta1 scope

2017-08-23 Thread Vrushali Channapattan
For timeline service v2, we have completed all subtasks under YARN-5355
[1].

We initiated a merge-to-trunk vote [2] yesterday.

thanks
Vrushali
[1] https://issues.apache.org/jira/browse/YARN-5355
[2]
http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201708.mbox/%3CCAE=b_fblt2j+ezb4wqdn_uwbig1sd5kpqgaw+9br__zou5y...@mail.gmail.com%3E


On Wed, Aug 23, 2017 at 11:12 AM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> Agreed. I was very clearly not advocating for rushing in features. If you
> have followed my past emails, I have only strongly advocated features be
> worked in branches and get merged when they are in a reasonable state.
>
> Each branch contributor group should look at their readiness and merge
> stuff in assuming that the branch reached a satisfactory state. That’s it.
>
> From a release management perspective, blocking features just because we are
> a month away from the deadline is not reasonable. Let the branch
> contributors rationalize, make this decision and the rest of us can support
> them in making the decision.
>
> +Vinod
>
> > At this point, there have been three planned alphas from September 2016
> > until July 2017 to "get in features".  While a couple of upcoming features
> > are "a few weeks" away, I think all of us are aware how predictable
> > software development schedules can be.  I think we can also all agree that
> > rushing just to meet a release deadline isn't the best practice when it
> > comes to software development either.
> >
> > Andrew has been very clear about his goals at each step and I think
> > Wangda's willingness to not rush in resource types was an appropriate
> > response.  I'm sympathetic to the goals of getting in a feature for 3.0,
> > but it might be a good idea for each project that is a "few weeks away" to
> > seriously look at the readiness compared to the features which have been
> > testing for 6+ months already.
> >
> > -Ray
>
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-23 Thread Vinod Kumar Vavilapalli
Agreed. I was very clearly not advocating for rushing in features. If you have 
followed my past emails, I have only strongly advocated that features be worked 
on in branches and merged when they are in a reasonable state.

Each branch contributor group should look at their readiness and merge stuff in, 
assuming that the branch has reached a satisfactory state. That’s it.

From a release management perspective, blocking features just because we are a 
month away from the deadline is not reasonable. Let the branch contributors 
rationalize and make this decision, and the rest of us can support them in 
making it.

+Vinod

> At this point, there have been three planned alphas from September 2016 until 
> July 2017 to "get in features".  While a couple of upcoming features are "a 
> few weeks" away, I think all of us are aware how predictable software 
> development schedules can be.  I think we can also all agree that rushing 
> just to meet a release deadline isn't the best practice when it comes to 
> software development either.
> 
> Andrew has been very clear about his goals at each step and I think Wangda's 
> willingness to not rush in resource types was an appropriate response.  I'm 
> sympathetic to the goals of getting in a feature for 3.0, but it might be a 
> good idea for each project that is a "few weeks away" to seriously look at 
> the readiness compared to the features which have been in testing for 6+ 
> months already.
> 
> -Ray



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-08-23 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/

[Aug 22, 2017 8:14:12 AM] (aajisaka) YARN-7047. Moving logging APIs over to slf4j in
[Aug 22, 2017 10:55:48 AM] (stevel) HADOOP-14787. AliyunOSS: Implement the `createNonRecursive` operator.
[Aug 22, 2017 2:47:39 PM] (xiao) HADOOP-14705. Add batched interface reencryptEncryptedKeys to KMS.
[Aug 22, 2017 5:56:09 PM] (jlowe) YARN-2416. InvalidStateTransitonException in ResourceManager if
[Aug 22, 2017 6:16:24 PM] (jlowe) YARN-7048. Fix tests faking kerberos to explicitly set ugi auth type.
[Aug 22, 2017 9:50:01 PM] (jlowe) HADOOP-14687. AuthenticatedURL will reuse bad/expired session cookies.
[Aug 23, 2017 2:20:57 AM] (subru) YARN-7053. Move curator transaction support to ZKCuratorManager.




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
   Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext) At DockerLinuxContainerRuntime.java:[line 490]

Failed junit tests :

   hadoop.security.TestRaceWhenRelogin
   hadoop.net.TestDNS
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation
   hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation
   hadoop.yarn.sls.appmaster.TestAMSimulator
   hadoop.yarn.sls.nodemanager.TestNMSimulator

Timed out junit tests :

   org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA
   org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA
  

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-compile-javac-root.txt  [292K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-checkstyle-root.txt  [17M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-patch-pylint.txt  [20K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-patch-shellcheck.txt  [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-patch-shelldocs.txt  [12K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/whitespace-eol.txt  [11M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/whitespace-tabs.txt  [1.2M]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html  [8.0K]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/diff-javadoc-javadoc-root.txt  [1.9M]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt  [148K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [236K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt  [64K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/501/artifact/out/patch-unit-hadoop-tools_hadoop-sls.txt  [16K]

Powered by Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

[jira] [Created] (HADOOP-14802) Add support for using container saskeys for all accesses

2017-08-23 Thread Sivaguru Sankaridurg (JIRA)
Sivaguru Sankaridurg created HADOOP-14802:
-

             Summary: Add support for using container saskeys for all accesses
                 Key: HADOOP-14802
                 URL: https://issues.apache.org/jira/browse/HADOOP-14802
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs/azure
            Reporter: Sivaguru Sankaridurg
            Assignee: Sivaguru Sankaridurg


This JIRA tracks adding support for using the container saskey for all storage 
access.
Instead of using saskeys that are specific to each blob, it is possible to 
re-use the container saskey for all blob accesses.
This provides a performance improvement over using blob-specific saskeys.
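
As a rough illustration of why re-using one container saskey can be cheaper, here is a minimal Python sketch. All names in it (SasKeyProvider, keys_per_blob, key_per_container) are hypothetical and are not taken from the fs/azure code; the sketch only counts how many keys would have to be issued under each approach, assuming that issuing a key is the costly step.

```python
from typing import Dict, List


class SasKeyProvider:
    """Hypothetical stand-in for whatever service issues saskeys."""

    def __init__(self) -> None:
        self.requests = 0  # number of keys issued so far

    def issue(self, scope: str) -> str:
        # Each call models one (assumed costly) key-generation request.
        self.requests += 1
        return f"sas-for-{scope}"


def keys_per_blob(provider: SasKeyProvider, container: str,
                  blobs: List[str]) -> Dict[str, str]:
    """Blob-specific approach: one key request for every blob accessed."""
    return {blob: provider.issue(f"{container}/{blob}") for blob in blobs}


def key_per_container(provider: SasKeyProvider, container: str,
                      blobs: List[str]) -> Dict[str, str]:
    """Container approach: request one key and reuse it for every blob."""
    key = provider.issue(container)
    return {blob: key for blob in blobs}
```

With N blobs in one container, the blob-specific function makes N requests to the provider and the container-level function makes one, which is the saving the description refers to.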



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
