Releases table needs to be cleaned up

2018-07-02 Thread Andrew Wang
Hi folks,

https://hadoop.apache.org/releases.html

The table of releases here is supposed to only contain one row per release
line. We've got a lot of old dupes at this point, e.g. 3.0.2, 2.9.0, 2.8.3.
It'd also be good to keep this sorted by version number, so 3.1.0 is at the
top.

Could a recent RM take care of cleaning this up?

Thanks,
Andrew


Re: site doc cleanup

2018-07-02 Thread Andrew Wang
It seems aggressive to delete docs just because a line hasn't had a recent
release. We don't have a formal EOL policy for release lines, and old
releases (particularly old clients) are still used, with old docs linked in
various places.

Also, sorry if I missed the original rationale, but what do we gain from
deleting old docs?

On Fri, Jun 29, 2018 at 8:24 AM Owen O'Malley 
wrote:

> I propose keeping the last patch release on each X.Y branch, and only
> keeping the versions that have been maintained recently (a patch release in
> the last year?).
>
> Thoughts?
>
> .. Owen
>
> > On Jun 28, 2018, at 19:19, Steve Loughran 
> wrote:
> >
> > Rm'd all of 3.0.0-* ; left the current/stable symlinks alone
> >
> > On 27 Jun 2018, at 21:17, Sean Busbey <bus...@cloudera.com> wrote:
> >
> >
> > 3.1.0 was labeled "not ready for production" in its release notes[1].
> > Seems that means 3.0.3 is the stable3 release?
> >
> > Speaking with my HBase hat on I'd rather "current" from the sitemap
> > point at a version folks could reasonably expect HBase to run on top
> > of. Unfortunately, I think that would likely be 2.9.1 due to ongoing
> > issues[2].
> >
>


Re: HADOOP-15124 review

2018-06-26 Thread Andrew Wang
I think it's fine to ask for review help on the dev list, sometimes JIRAs
are missed or get stuck. It also helps sometimes to git blame the files
you're touching and see who is a likely reviewer, and then pinging them on
JIRA to ask for a review.
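
For instance, something like this surfaces the frequent recent authors of a
file (a sketch; the path is just illustrative):

  # list the most frequent authors of the file over the past year
  git log --since='1 year ago' --format='%an' -- \
      hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java |
    sort | uniq -c | sort -rn | head -5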

Igor, hopefully a committer takes interest in your JIRA and helps you get
it integrated.

Best,
Andrew

On Tue, Jun 26, 2018 at 9:27 PM Igor Dvorzhak 
wrote:

> Hi Yiqun,
>
> Thank you for the explanation. I didn't know that this was not appropriate
> and will not do so in the future.
>
> Thanks,
> Igor
>
>
> On Tue, Jun 26, 2018 at 7:18 PM Lin,Yiqun(vip.com) <yiqun01@vipshop.com> wrote:
>
>> Hi Igor,
>>
>> It's not appropriate to ask for a review in the dev mailing list. The dev
>> mailing list is mainly used for discussion and answering users' questions.
>> You can ask for the review on the specific JIRA, where it will be seen by
>> committers and others. If they have time, they will help with the review.
>>
>> Thanks,
>> Yiqun
>>
>> From: Igor Dvorzhak [mailto:i...@google.com.INVALID]
>> Sent: June 26, 2018 23:52
>> To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org
>> Subject: Re: HADOOP-15124 review
>>
>> +common-dev@hadoop.apache.org
>>
>> On Tue, Jun 26, 2018 at 8:49 AM Igor Dvorzhak <i...@google.com> wrote:
>> Hello,
>>
>> I have a patch that improves the FileSystem.Statistics implementation,
>> and I would like to commit it.
>>
>> May somebody review it?
>>
>> Best regards,
>> Igor Dvorzhak
>>
>


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-05 Thread Andrew Wang
Hi Sanjay, thanks for the response, replying inline:

> - NN on top of HDSL, where the NN uses the new block layer (both Daryn and Owen
> acknowledge the benefit of the new block layer).  We have two choices here
>  ** a) Evolve NN so that it can interact with both old and new block layer,
>  **  b) Fork and create new NN that works only with new block layer, the
> old NN will continue to work with old block layer.
> There are trade-offs but clearly the 2nd option has least impact on the
> old HDFS code.
>
Are you proposing that we pursue the 2nd option to integrate HDSL with
HDFS?


> - Share HDSL's netty protocol engine with the HDFS block layer.  After
> HDSL and Ozone have stabilized the engine, put the new netty engine in
> either HDFS or in Hadoop common - HDSL will use it from there. The HDFS
> community has been talking about moving to a better thread model for HDFS
> DNs since release 0.16!!
>
The Netty-based protocol engine seems like it could be contributed
separately from HDSL. I'd be interested to learn more about the performance
and other improvements from this new engine.


> - Shallow copy. Here HDSL needs a way to get the actual linux file system
> links - the HDFS block layer needs to provide a private secure API to get the
> file names of blocks so that HDSL can do a hard link (hence shallow copy).
>

Why isn't this possible with two processes? SCR for instance securely
passes file descriptors between the DN and client over a unix domain
socket. I'm sure we can construct a protocol that securely and efficiently
creates hardlinks.

It also sounds like this shallow copy won't work with features like HDFS
encryption or erasure coding, which diminishes its utility. We also don't
even have HDFS-to-HDFS shallow copy yet, so HDFS-to-Ozone shallow copy is
even further out.

Best,
Andrew


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-05 Thread Andrew Wang
Hi Owen, Wangda,

Thanks for clearly laying out the subproject options, that helps the
discussion.

I'm all onboard with the idea of regular releases, and it's something I
tried to do with the 3.0 alphas and betas. The problem though isn't a lack
of commitment from feature developers like Sanjay or Jitendra; far from it!
I think every feature developer makes a reasonable effort to test their
code before it's merged. Yet, my experience as an RM is that more code
comes with more risk. I don't believe that Ozone is special or different in
this regard. It comes with a maintenance cost, not a maintenance benefit.

I'm advocating for #3: separate source, separate release. Since HDSL
stability and FSN/BM refactoring are still a ways out, I don't want to
incur a maintenance cost now. I sympathize with the sentiment that working
cross-repo is harder than within same repo, but the right tooling can make
this a lot easier (e.g. git submodule, Google's repo tool). We have
experience doing this internally here at Cloudera, and I'm happy to share
knowledge and possibly code.
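
As a rough sketch of the submodule approach (the HDSL repo URL and path here
are hypothetical, just to show the shape of the workflow):

  # pin a separately-released HDSL repo at a known-good commit inside Hadoop
  git submodule add https://gitbox.apache.org/repos/asf/hadoop-hdsl.git hadoop-hdsl-project
  git submodule update --init   # check out the pinned commit

Trunk would then record exactly which HDSL commit it builds against, while
each side keeps its own release cadence.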

Best,
Andrew

On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wheele...@gmail.com> wrote:

> I like the idea of same source / same release and put Ozone's source under
> a different directory.
>
> Like Owen mentioned, it's going to be important for all parties to keep a
> regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
> releases. Users can try features and give feedback to stabilize features
> earlier; developers can be happier since their efforts reach users soon
> after features get merged. In addition to this, if features are merged to
> trunk after reasonable testing/review, Andrew's concern may not be a problem
> anymore:
>
> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements.
>
> Thanks,
> Wangda
>
>
> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <owen.omal...@gmail.com>
> wrote:
>
>> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <andrew.w...@cloudera.com>
>> wrote:
>>
>> Owen mentioned making a Hadoop subproject; we'd have to
>> > hash out what exactly this means (I assume a separate repo still
>> managed by
>> > the Hadoop project), but I think we could make this work if it's more
>> > attractive than incubation or a new TLP.
>>
>>
>> Ok, there are multiple levels of sub-projects that all make sense:
>>
>>   - Same source tree, same releases - examples like HDFS & YARN
>>   - Same master branch, separate releases and release branches - Hive's
>>     Storage API vs Hive. It is in the source tree for the master branch, but
>>     has distinct releases and release branches.
>>   - Separate source, separate release - Apache Commons.
>>
>> There are advantages and disadvantages to each. I'd propose that we use the
>> same source, same release pattern for Ozone. Note that we tried and later
>> reverted doing Common, HDFS, and YARN as separate source, separate release
>> because it was too much trouble. I like Daryn's idea of putting it as a
>> top-level directory in Hadoop and making sure that nothing in Common, HDFS,
>> or YARN depends on it. That way if a Release Manager doesn't think it is
>> ready for release, it can be trivially removed before the release.
>>
>> One thing about using the same releases: Sanjay and Jitendra are signing up
>> to make much more regular bugfix and minor releases in the near future. For
>> example, they'll need to make 3.2 relatively soon to get it released and
>> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
>> project. Hadoop needs more regular releases and fewer big-bang releases.
>>
>> .. Owen
>>
>
>


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-01 Thread Andrew Wang
Hi Sanjay,

I have different opinions about what's important and how to eventually
integrate this code, and that's not because I'm "conveniently ignoring"
your responses. I'm also not making some of the arguments you claim I am
making. Attacking arguments I'm not making is not going to change my mind,
so let's bring it back to the arguments I am making.

Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
near-term, and it comes with a maintenance cost.

I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
integration does not necessarily require a lock split. However, there still
needs to be refactoring to clearly define the FSN and BM interfaces and
make the BM pluggable so HDSL can be swapped in. This is a major
undertaking and risky. We did a similar refactoring in 2.x which made
backports hard and introduced bugs. I don't think we should have done this
in a minor release.

Furthermore, I don't know what your expectation is on how long it will take
to stabilize HDSL, but this horizon for other storage systems is typically
measured in years rather than months.

Both of these feel like Hadoop 4 items: a ways out yet.

Moving on, there is a non-trivial maintenance cost to having this new code
in the code base. Ozone bugs become our bugs. Ozone dependencies become our
dependencies. Ozone's security flaws are our security flaws. All of this
negatively affects our already lumbering release schedule, and thus our
ability to deliver and iterate on the features we're already trying to
ship. Even if Ozone is separate and off by default, this is still a large
amount of code that comes with a large maintenance cost. I don't want to
incur this cost when the benefit is still a ways out.

We disagree on the necessity of sharing a repo and sharing operational
behaviors. Libraries exist as a method for sharing code. HDFS also hardly
has a monopoly on intermediating storage today. Disks are shared with MR
shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
we've made this work. Having Ozone/HDSL in a separate process can even be
seen as an operational advantage since it's isolated. I firmly believe that
we can solve any implementation issues even with separate processes.

This is why I asked about making this a separate project. Given that these
two efforts (HDSL stabilization and NN refactoring) are a ways out, the
best way to get Ozone/HDSL in the hands of users today is to release it as
its own project. Owen mentioned making a Hadoop subproject; we'd have to
hash out what exactly this means (I assume a separate repo still managed by
the Hadoop project), but I think we could make this work if it's more
attractive than incubation or a new TLP.

I'm excited about the possibilities of both HDSL and the NN refactoring in
ensuring a future for HDFS for years to come. A pluggable block manager
would also let us experiment with things like HDFS-on-S3, increasingly
important in a cloud-centric world. CBlock would bring HDFS to new usecases
around generic container workloads. However, given the timeline for
completing these efforts, now is not the time to merge.

Best,
Andrew

On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp  wrote:

> I'm generally neutral and looked foremost at developer impact.  I.e., will
> it be so intertwined with hdfs that each project risks destabilizing the
> other?  Will developers with no expertise in ozone be impeded?  I
> think the answer is currently no.  These are the intersections and some
> concerns based on the assumption ozone is accepted into the project:
>
>
> Common
>
> There appear to be a number of superfluous changes.  The conf servlet must not be
> polluted with specific references and logic for ozone.  We don’t create
> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
> “ozone free”.
>
>
> Datanode
>
> I expected ozone changes to be intricately linked with the existing blocks
> map, dataset, volume, etc.  Thankfully they're not.  As an independent
> service, the DN should not be polluted with specific references to ozone.
> If ozone is in the project, the DN should have a generic plugin interface
> conceptually similar to the NM aux services.
>
>
> Namenode
>
> No impact, currently, but certainly will be…
>
>
> Code Location
>
> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
> hadoop-hdsl-project.  This clean separation will make it easier to later
> spin off or pull in depending on which way we vote.
>
>
> Dependencies
>
> Owen hit upon this before I could send.  Hadoop is already bursting with
> dependencies; I hope this doesn't pull in a lot more.
>
>
> ––
>
>
> Do I think ozone should be a separate project?  If we view it only as a
> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
> step with near-term benefits, no, we want to keep it close and help it
> evolve. 

Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-02-28 Thread Andrew Wang
Resending since the formatting was messed up, let's try plain text this
time:

Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
document at HDFS-10419, and still have the same concerns raised earlier
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that Ozone is at a stage where
it's ready for user feedback. Second, that it needs to be merged to start on
the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters and new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or featureset; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
a new, different system that could reasonably be deployed and tested
separately from HDFS. It's unlikely to replace many of today's HDFS
deployments, and from what I understand, Ozone was not designed to do this.

Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.

Best,
Andrew

Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-02-27 Thread Andrew Wang
Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
document at HDFS-10419, and still have the same concerns raised earlier
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that Ozone is at a stage where
it's ready for user feedback. Second, that it needs to be merged to start on
the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters and new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or featureset; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
a new, different system that could reasonably be deployed and tested
separately from HDFS. It's unlikely to replace many of today's HDFS
deployments, and from what I understand, Ozone was not designed to do this.

Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
undertaking. The discussion on HDFS-10419 is still ongoing so it's not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.

Best,
Andrew

On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey 
wrote:

> Dear folks,
> We would like to start a vote to merge the HDFS-7240 branch into
> trunk. The context can be reviewed in the DISCUSSION thread, and in the
> jiras (See references below).
>
> HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which is
> a distributed, replicated block layer.
> The old HDFS namespace and NN can be connected to this new block layer
> as we have described in HDFS-10419.
> We also introduce a key-value namespace called Ozone built on HDSL.
>
> The code is in a separate module and is turned off by default. In a
> secure setup, HDSL and Ozone daemons cannot be started.
>
> The detailed documentation is available at
>  https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>
>
> I will start with my vote.
> +1 (binding)
>
>
> Discussion Thread:
>   https://s.apache.org/7240-merge
>   https://s.apache.org/4sfU
>
> Jiras:
>https://issues.apache.org/jira/browse/HDFS-7240
>https://issues.apache.org/jira/browse/HDFS-10419
>https://issues.apache.org/jira/browse/HDFS-13074
>https://issues.apache.org/jira/browse/HDFS-13180
>
>
> Thanks
> jitendra
>
> DISCUSSION THREAD SUMMARY :
>
> On 2/13/18, 6:28 PM, "sanjay Radia" 
> wrote:
>
> Sorry, the formatting got messed up by my email client.  Here
> it is again
>
>
> Dear Hadoop Community Members,
>
> We had multiple community discussions, a few meetings
> in smaller groups and also jira discussions with respect to 

Re: [DISCUSS] 2.9+ stabilization branch

2018-02-27 Thread Andrew Wang
Hi Konst and all,

Is there a list of 3.0 specific upgrade concerns that you could share? I
understand that a new major release comes with risk simply due to the
amount of code change, but we've done our best as a community to alleviate
these concerns through much improved integration testing and compatibility
efforts like the shaded client and revamped compat guide. I'd love to hear
about what else we can do here to improve our 3.x upgrade story.

I understand the need for a bridge release as an upgrade path to 3.x, but I
want to make sure we don't end up needing a 2.11 or 2.12 also. The scope
mentioned here isn't really bridging improvements, which in my mind are
compatibility improvements that help with running 2.x and 3.x clients
concurrently to enable a later upgrade to just 3.x. Including new features
makes this harder (or at least not easier), and means more ongoing
maintenance work on 2.x.

So, a hearty +1 to your closing statement: if we're going to do a bridge
release, let's do it right and do it once.

Best,
Andrew

On Tue, Feb 27, 2018 at 6:21 PM, Konstantin Shvachko <shv.had...@gmail.com>
wrote:

> Thanks Subru for initiating the thread about GPU support.
> I think the path of taking 2.9 as a base for 2.10 and adding new resource
> types into it is quite reasonable.
> That way we can combine stabilization effort on 2.9 with GPUs.
>
> Arun, upgrading Java is probably a separate topic.
> We should discuss it on a separate followup thread if we agree to add GPU
> support into 2.10.
>
> Andrew, we actually ran a small 3.0 cluster to experiment with Tensorflow
> on YARN with gpu resources. It worked well! Therefore the interest.
> Although given the breadth (and the quantity) of our use cases it is
> infeasible to jump directly to 3.0, as Jonathan explained.
> A transitional stage such as 2.10 will be required. Probably the same for
> many other big-cluster folks.
> It would be great if people who run different hadoop versions <= 2.8 can
> converge at the 2.10 bridge, to help cross over to 3.
> GPU support would be a serious catalyst for us to move forward, which I
> also heard from other organizations interested in ML.
>
> Thanks,
> --Konstantin
>
> On Tue, Feb 27, 2018 at 1:28 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Hi Arun/Subru,
>>
>> Bumping the minimum Java version is a major change, and incompatible for
>> users who are unable to upgrade their JVM version. We're beyond the EOL for
>> Java 7, but as we know from our experience with Java 6, there are plenty of
>> users who stick on old Java versions. Bumping the Java version also makes
>> backports more difficult, and we're still maintaining a number of older 2.x
>> releases. I think this is too big for a minor release, particularly when we
>> have 3.x as an option that fully supports Java 8.
>>
>> What's the rationale for bumping it here?
>>
>> I'm also curious if there are known issues with 3.x that we can fix to make
>> 3.x upgrades smoother. I would prefer improving the upgrade experience to
>> backporting major features to 2.x since 3.x is meant to be the delivery
>> vehicle for new features beyond the ones named here.
>>
>> Best,
>> Andrew
>>
>> On Tue, Feb 27, 2018 at 11:01 AM, Arun Suresh <asur...@apache.org> wrote:
>>
>> > Hello folks
>> >
>> > We also think this bridging release opens up an opportunity to bump the
>> > java version in branch-2 to java 8.
>> > Would really love to hear thoughts on that.
>> >
>> > Cheers
>> > -Arun/Subru
>> >
>> >
>> > On Mon, Feb 26, 2018 at 5:18 PM, Jonathan Hung <jyhung2...@gmail.com>
>> > wrote:
>> >
>> > > Hi Subru,
>> > >
>> > > Thanks for starting the discussion.
>> > >
>> > > We (LinkedIn) have an immediate need for resource types and native GPU
>> > > support. Given we are running 2.7 on our main clusters, we decided to
>> > avoid
>> > > deploying hadoop 3.x on our machine learning clusters (and having to
>> > > support two very different hadoop versions). Since for us there is
>> > > considerable risk and work involved in upgrading to hadoop 3, I think
>> > > having a branch-2.10 bridge release for porting important hadoop 3
>> > features
>> > > to branch-2 is a good idea.
>> > >
>> > > Thanks,
>> > >
>> > >
>> > > Jonathan Hung
>> > >
>> > > On Mon, Feb 26, 2018 at 2:37 PM, Subru Krishnan <su...@apache.org>
>> > wrote:

Re: [DISCUSS] 2.9+ stabilization branch

2018-02-27 Thread Andrew Wang
Hi Arun/Subru,

Bumping the minimum Java version is a major change, and incompatible for
users who are unable to upgrade their JVM version. We're beyond the EOL for
Java 7, but as we know from our experience with Java 6, there are plenty of
users who stick on old Java versions. Bumping the Java version also makes
backports more difficult, and we're still maintaining a number of older 2.x
releases. I think this is too big for a minor release, particularly when we
have 3.x as an option that fully supports Java 8.

What's the rationale for bumping it here?

I'm also curious if there are known issues with 3.x that we can fix to make
3.x upgrades smoother. I would prefer improving the upgrade experience to
backporting major features to 2.x since 3.x is meant to be the delivery
vehicle for new features beyond the ones named here.

Best,
Andrew

On Tue, Feb 27, 2018 at 11:01 AM, Arun Suresh  wrote:

> Hello folks
>
> We also think this bridging release opens up an opportunity to bump the
> java version in branch-2 to java 8.
> Would really love to hear thoughts on that.
>
> Cheers
> -Arun/Subru
>
>
> On Mon, Feb 26, 2018 at 5:18 PM, Jonathan Hung 
> wrote:
>
> > Hi Subru,
> >
> > Thanks for starting the discussion.
> >
> > We (LinkedIn) have an immediate need for resource types and native GPU
> > support. Given we are running 2.7 on our main clusters, we decided to
> avoid
> > deploying hadoop 3.x on our machine learning clusters (and having to
> > support two very different hadoop versions). Since for us there is
> > considerable risk and work involved in upgrading to hadoop 3, I think
> > having a branch-2.10 bridge release for porting important hadoop 3
> features
> > to branch-2 is a good idea.
> >
> > Thanks,
> >
> >
> > Jonathan Hung
> >
> > On Mon, Feb 26, 2018 at 2:37 PM, Subru Krishnan 
> wrote:
> >
> > > Folks,
> > >
> > > We (i.e. Microsoft) have started stabilization of 2.9 for our
> production
> > > deployment. During planning, we realized that we need to backport 3.x
> > > features to support GPUs (and more resource types like network IO)
> > natively
> > > as part of the upgrade. We'd like to share that work with the
> community.
> > >
> > > Instead of stabilizing the base release and cherry-picking fixes back
> to
> > > Apache, we want to work publicly and push fixes directly into
> > > trunk/.../branch-2 for a stable 2.10.0 release. Our goal is to create a
> > > bridge release for our production clusters to the 3.x series and to
> > address
> > > scalability problems in large clusters (N*10k nodes). As we find
> issues,
> > we
> > > will file JIRAs and track resolution of significant regressions/faults
> in
> > > wiki. Moreover, LinkedIn also has committed plans for a production
> > > deployment of the same branch. We welcome broad participation,
> > particularly
> > > since we'll be stabilizing relatively new features.
> > >
> > > The exact list of features we would like to backport in YARN are:
> > >
> > >- Support for Resource types [1][2]
> > >- Native support for GPUs[3]
> > >- Absolute Resource configuration in CapacityScheduler [4]
> > >
> > >
> > > With regards to HDFS, we are currently looking mainly at fixes to
> > > Router-based Federation and Windows-specific fixes, which should anyway
> > > flow normally.
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > Subru/Arun
> > >
> > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27786.html
> > > [2] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg28281.html
> > > [3] https://issues.apache.org/jira/browse/YARN-6223
> > > [4] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg28772.html
> > >
> >
>


Re: Apache Hadoop 3.0.1 Release plan

2018-01-09 Thread Andrew Wang
Hi Eddy, thanks for taking this on,

Historically we've waited for the first RC to cut the release branch since
it keeps things simpler for committers.

Also, could you check the permissions on your JIRA filter? It shows as
private for me.

Best,
Andrew

On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:

> Hi, All
>
> We released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of the release line, we plan to cut the branch-3.0.1
> branch tomorrow in preparation for the Apache Hadoop 3.0.1 release. The
> focus of 3.0.1 will be fixing blockers (3), critical bugs (1), and other
> bug fixes [2]. No new features or improvements should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote on an RC on Feb
> 1st, targeting a Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
>


Re: [ANNOUNCE] Apache Hadoop 3.0.0 GA is released

2017-12-18 Thread Andrew Wang
Thanks for the spot, I just pushed a correct tag. I can't delete the bad
tag myself, will ask ASF infra for help.
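
For the record, the fix amounts to roughly this (the commands are an
approximation, not a transcript; the commit hash is from Jonathan's mail):

  # re-tag the release commit under the intended name and push it
  git tag -s rel/release-3.0.0 c25427ceca461ee979d30edd7a4b0f50718e6533 -m 'Hadoop 3.0.0 release'
  git push origin rel/release-3.0.0
  # deleting the stray 'rel/release-' tag is restricted, hence the INFRA ask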

On Mon, Dec 18, 2017 at 4:46 PM, Jonathan Kelly <jonathaka...@gmail.com>
wrote:

> Congrats on the huge release!
>
> I just noticed, though, that the Github repo does not appear to have the
> correct tag for 3.0.0. I see a new tag called "rel/release-" that points to
> the same commit as "release-3.0.0-RC1" 
> (c25427ceca461ee979d30edd7a4b0f50718e6533).
> I assume that should have actually been called "rel/release-3.0.0" to match
> the pattern for prior releases.
>
> Thanks,
> Jonathan Kelly
>
> On Thu, Dec 14, 2017 at 10:45 AM Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Hi all,
>>
>> I'm pleased to announce that Apache Hadoop 3.0.0 is generally available
>> (GA).
>>
>> 3.0.0 GA consists of 302 bug fixes, improvements, and other enhancements
>> since 3.0.0-beta1. This release marks a point of quality and stability for
>> the 3.0.0 release line, and users of earlier 3.0.0-alpha and -beta
>> releases
>> are encouraged to upgrade.
>>
>> Looking back, 3.0.0 GA is the culmination of over a year of work on the
>> 3.0.0 line, starting with 3.0.0-alpha1 which was released in September
>> 2016. Altogether, 3.0.0 incorporates 6,242 changes since 2.7.0.
>>
>> Users are encouraged to read the overview of major changes
>> <http://hadoop.apache.org/docs/r3.0.0/index.html> in 3.0.0. The GA release
>> notes
>> <http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html>
>>  and changelog
>> <http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html>
>> detail the changes since 3.0.0-beta1.
>>
>> The ASF press release provides additional color and highlights some of the
>> major features:
>>
>> https://globenewswire.com/news-release/2017/12/14/1261879/0/en/The-Apache-Software-Foundation-Announces-Apache-Hadoop-v3-0-0-General-Availability.html
>>
>> Let me end by thanking the many, many contributors who helped with this
>> release line. We've only had three major releases in Hadoop's 10 year
>> history, and this is our biggest major release ever. It's an incredible
>> accomplishment for our community, and I'm proud to have worked with all of
>> you.
>>
>> Best,
>> Andrew
>>
>


Re: [ANNOUNCE] Apache Hadoop 3.0.0 GA is released

2017-12-18 Thread Andrew Wang
Moving general@ to BCC,

The main page and release posts on hadoop.apache.org are pretty clear
about this being a diff from beta1; am I missing something? Pasted below:

After four alpha releases and one beta release, 3.0.0 is generally
available. 3.0.0 consists of 302 bug fixes, improvements, and other
enhancements since 3.0.0-beta1. All together, 6242 issues were fixed as
part of the 3.0.0 release series since 2.7.0.

Users are encouraged to read the overview of major changes
<http://hadoop.apache.org/docs/r3.0.0/index.html> in 3.0.0. The GA release
notes
<http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html>
 and changelog
<http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html>
detail
the changes since 3.0.0-beta1.



On Mon, Dec 18, 2017 at 10:32 AM, Arpit Agarwal <aagar...@hortonworks.com>
wrote:

> That makes sense for Beta users but most of our users will be upgrading
> from a previous GA release and the changelog will mislead them. The webpage
> does not mention this is a delta from the beta release.
>
>
>
>
>
> *From: *Andrew Wang <andrew.w...@cloudera.com>
> *Date: *Friday, December 15, 2017 at 10:36 AM
> *To: *Arpit Agarwal <aagar...@hortonworks.com>
> *Cc: *general <gene...@hadoop.apache.org>, "common-dev@hadoop.apache.org"
> <common-dev@hadoop.apache.org>, "yarn-...@hadoop.apache.org" <
> yarn-...@hadoop.apache.org>, "mapreduce-...@hadoop.apache.org" <
> mapreduce-...@hadoop.apache.org>, "hdfs-...@hadoop.apache.org" <
> hdfs-...@hadoop.apache.org>
> *Subject: *Re: [ANNOUNCE] Apache Hadoop 3.0.0 GA is released
>
>
>
> Hi Arpit,
>
>
>
> If you look at the release announcements, it's made clear that the
> changelog for 3.0.0 is diffed based on beta1. This is important since users
> need to know what's different from the previous 3.0.0-* releases if they're
> upgrading.
>
>
>
> I agree there's additional value to making combined release notes, but
> it'd be something additive rather than replacing what's there.
>
>
>
> Best,
>
> Andrew
>
>
>
> On Fri, Dec 15, 2017 at 8:27 AM, Arpit Agarwal <aagar...@hortonworks.com>
> wrote:
>
>
> Hi Andrew,
>
> Thank you for all the hard work on this release. I was out the last few
> days and didn’t get a chance to evaluate RC1 earlier.
>
> The changelog looks incorrect. E.g., this gives the impression that there
> are just 5 incompatible changes in 3.0.0.
> http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html
>
> I assume you only counted 3.0.0 changes in this log excluding
> alphas/betas. However, users shouldn’t have to manually compile
> incompatibilities by summing up a/b release notes. Can we fix the changelog
> after the fact?
>
>
>
>
> On 12/14/17, 10:45 AM, "Andrew Wang" <andrew.w...@cloudera.com> wrote:
>
> Hi all,
>
> I'm pleased to announce that Apache Hadoop 3.0.0 is generally available
> (GA).
>
> 3.0.0 GA consists of 302 bug fixes, improvements, and other
> enhancements
> since 3.0.0-beta1. This release marks a point of quality and stability
> for
> the 3.0.0 release line, and users of earlier 3.0.0-alpha and -beta
> releases
> are encouraged to upgrade.
>
> Looking back, 3.0.0 GA is the culmination of over a year of work on the
> 3.0.0 line, starting with 3.0.0-alpha1 which was released in September
> 2016. Altogether, 3.0.0 incorporates 6,242 changes since 2.7.0.
>
> Users are encouraged to read the overview of major changes
> <http://hadoop.apache.org/docs/r3.0.0/index.html> in 3.0.0. The GA release
> notes
> <http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html>
>  and changelog
> <http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html>
> detail the changes since 3.0.0-beta1.
>
> The ASF press release provides additional color and highlights some of
> the
> major features:
>
> https://globenewswire.com/news-release/2017/12/14/1261879/0/en/The-Apache-Software-Foundation-Announces-Apache-Hadoop-v3-0-0-General-Availability.html
>
> Let me end by thanking the many, many contributors who helped with this
> release line. We've only had three major releases in Hadoop's 10 year
> history, and this is our biggest major release ever. It's an incredible
> accomplishment for our community, and I'm proud to have worked with
> all of
> you.
>
> Best,
> Andrew
>


Re: Missing some trunk commit history

2017-12-15 Thread Andrew Wang
We actually already did:

https://lists.apache.org/thread.html/43cd65c6b6c3c0e8ac2b3c76afd9eff1f78b177fabe9c4a96d9b3d0b@1440189889@%3Ccommon-dev.hadoop.apache.org%3E
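
(For context, the property a --no-ff merge commit buys: the first-parent
history of trunk treats each feature branch as a single step. A sketch,
assuming a trunk checkout:

  # each entry is either a direct trunk commit or an entire feature-branch merge
  git log --first-parent --oneline

Testing at that granularity means checking out each first-parent commit in
turn, instead of every commit inside every merged branch.)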

On Fri, Dec 15, 2017 at 10:54 AM, Eric Yang  wrote:

> +1 for merge --no-ff for feature merges.
> Do we all agree on this optimization going forward?
>
> Regards,
> Eric
>
> On 12/15/17, 10:34 AM, "Chris Douglas"  wrote:
>
> On Thu, Dec 14, 2017 at 9:40 PM, Eric Yang  wrote:
> > I am looking for a way to reduce time spent on testing the latest
> > commits. [...] People who did the feature merge likely already did a full
> > build and test to ensure they didn't break trunk, but there is no easy
> > indicator of where the rebase starts and ends.
>
> OK, I think I understand. If we force a merge commit (i.e., specify
> --no-ff during the merge) then I think that has the property you're
> looking for without squashing all the history into a single commit. -C
>
> > Therefore, other people have to spend extra time testing each commit
> > individually. It reduces my productivity when I have to prove that my
> > pre-commit patch's unit test failure was caused by someone else's
> > check-in. I lost an entire day isolating a trunk build breakage in the
> > node manager that was caused by YARN-7381, and I was only able to find it
> > using GitHub's method of sorting commits by date instead of the git log
> > approach of showing commit histories. If I were testing commits one by
> > one based on git log, I would probably not be done testing yet. If we
> > propose using merge without rebase for trunk, it might be more efficient
> > for analyzing bugs in pre-commit builds.
> >
> > regards,
> > Eric
> >
> > On Thu, Dec 14, 2017 at 6:52 PM, Chris Douglas 
> wrote:
> >
> >> Eric-
> >>
> >> What problem are you trying to solve? Most of us understand how git
> works,
> >> you can omit that. -C
> >>
> >> On Thu, Dec 14, 2017 at 6:31 PM Eric Yang 
> wrote:
> >>
> >> > We currently ask committers to commit code based on:
> >> > https://wiki.apache.org/hadoop/HowToCommit
> >> >
> >> > which says to set branch.autosetuprebase to always.
> >> >
> >> > Based on the current preference, the history is linear, and it is
> >> > described in this graph as Rebase and Merge:
> >> >
> >> >
> >> > https://wac-cdn.atlassian.com/dam/jcr:df39b1f1-2686-4ee5-90bf-9836783342ce/10.svg?cdnVersion=iq
> >> >
> >> > It can cause a false alarm that blames the wrong person for a trunk
> >> > breakage, because it takes more time to iterate through all the commits
> >> > from a feature branch, while the recent commits (blue dots) are much
> >> > further back in history due to the rebase. If it were only one merge
> >> > commit, it would be faster to skip over the entire branch and find
> >> > recent breakages.
> >> >
> >> > When several feature branches are merged in a short period of time,
> >> > the extra work to check the revision history of the branches takes
> >> > much more time. This is a pain point for people who care about trunk
> >> > stability but can't afford all day to run a full build on each commit
> >> > to isolate the breakage.
> >> >
> >> > I understand your use case of looking at multiple branches to find a
> >> > commit, to make sure maintenance branches have the proper commits or
> >> > backports. Rebase + merge works best for maintenance branches. However,
> >> > I am not convinced that the rebase + merge strategy is an efficient way
> >> > to manage trunk stability. Is there a better way to manage this?
> >> > Perhaps we can recommend merge without rebase for trunk, while
> >> > maintenance branches apply the rebase + merge strategy. Thoughts?
> >> >
> >> > regards,
> >> > Eric
> >> >
> >> > On 12/14/17, 5:16 PM, "Chris Douglas" 
> wrote:
> >> >
> >> > I'm sorry, I literally don't understand what you've written. What do
> >> > clicks on github have to do with merges?
> >> >
> >> > Are you talking about git bisect, where one would first identify the
> >> > branch where the error was introduced, then run a second regression
> >> > over the feature branch? With similar semantics for blame?
> >> >
> >> > Again, I'd rather have the history of the branch, with rebases prior
> >> > to merge to ensure that feature branches don't create particularly
> >> > complicated graphs.
> >> >
> >> > Perhaps I haven't understood the problem you're solving. The thread

Re: [ANNOUNCE] Apache Hadoop 3.0.0 GA is released

2017-12-15 Thread Andrew Wang
Hi Arpit,

If you look at the release announcements, it's made clear that the
changelog for 3.0.0 is diffed based on beta1. This is important since users
need to know what's different from the previous 3.0.0-* releases if they're
upgrading.

I agree there's additional value to making combined release notes, but it'd
be something additive rather than replacing what's there.

Best,
Andrew

On Fri, Dec 15, 2017 at 8:27 AM, Arpit Agarwal <aagar...@hortonworks.com>
wrote:

>
> Hi Andrew,
>
> Thank you for all the hard work on this release. I was out the last few
> days and didn’t get a chance to evaluate RC1 earlier.
>
> The changelog looks incorrect. E.g., this gives the impression that there
> are just 5 incompatible changes in 3.0.0.
> http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html
>
> I assume you only counted 3.0.0 changes in this log excluding
> alphas/betas. However, users shouldn’t have to manually compile
> incompatibilities by summing up a/b release notes. Can we fix the changelog
> after the fact?
>
>
>
>
> On 12/14/17, 10:45 AM, "Andrew Wang" <andrew.w...@cloudera.com> wrote:
>
> Hi all,
>
> I'm pleased to announce that Apache Hadoop 3.0.0 is generally available
> (GA).
>
> 3.0.0 GA consists of 302 bug fixes, improvements, and other
> enhancements
> since 3.0.0-beta1. This release marks a point of quality and stability
> for
> the 3.0.0 release line, and users of earlier 3.0.0-alpha and -beta
> releases
> are encouraged to upgrade.
>
> Looking back, 3.0.0 GA is the culmination of over a year of work on the
> 3.0.0 line, starting with 3.0.0-alpha1 which was released in September
> 2016. Altogether, 3.0.0 incorporates 6,242 changes since 2.7.0.
>
> Users are encouraged to read the overview of major changes
> <http://hadoop.apache.org/docs/r3.0.0/index.html> in 3.0.0. The GA release
> notes
> <http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html>
>  and changelog
> <http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html>
> detail the changes since 3.0.0-beta1.
>
> The ASF press release provides additional color and highlights some of
> the
> major features:
>
> https://globenewswire.com/news-release/2017/12/14/1261879/0/en/The-Apache-Software-Foundation-Announces-Apache-Hadoop-v3-0-0-General-Availability.html
>
> Let me end by thanking the many, many contributors who helped with this
> release line. We've only had three major releases in Hadoop's 10 year
> history, and this is our biggest major release ever. It's an incredible
> accomplishment for our community, and I'm proud to have worked with
> all of
> you.
>
> Best,
> Andrew
>


[ANNOUNCE] Apache Hadoop 3.0.0 GA is released

2017-12-14 Thread Andrew Wang
Hi all,

I'm pleased to announce that Apache Hadoop 3.0.0 is generally available
(GA).

3.0.0 GA consists of 302 bug fixes, improvements, and other enhancements
since 3.0.0-beta1. This release marks a point of quality and stability for
the 3.0.0 release line, and users of earlier 3.0.0-alpha and -beta releases
are encouraged to upgrade.

Looking back, 3.0.0 GA is the culmination of over a year of work on the
3.0.0 line, starting with 3.0.0-alpha1 which was released in September
2016. Altogether, 3.0.0 incorporates 6,242 changes since 2.7.0.

Users are encouraged to read the overview of major changes
<http://hadoop.apache.org/docs/r3.0.0/index.html> in 3.0.0. The GA release
notes
<http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html>
 and changelog
<http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html>
detail the changes since 3.0.0-beta1.

The ASF press release provides additional color and highlights some of the
major features:

https://globenewswire.com/news-release/2017/12/14/1261879/0/en/The-Apache-Software-Foundation-Announces-Apache-Hadoop-v3-0-0-General-Availability.html

Let me end by thanking the many, many contributors who helped with this
release line. We've only had three major releases in Hadoop's 10 year
history, and this is our biggest major release ever. It's an incredible
accomplishment for our community, and I'm proud to have worked with all of
you.

Best,
Andrew


Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-13 Thread Andrew Wang
Hi folks,

To close this out, the vote passes successfully with 13 binding +1s, 5
non-binding +1s, and no -1s. Thanks everyone for voting! I'll work on
staging.

I'm hoping we can address YARN-7588 and any remaining rolling upgrade
issues in 3.0.x maintenance releases. Beyond a wiki page, it would be
really great to get JIRAs filed and targeted for tracking as soon as
possible.

Vinod, what do you think we need to do regarding caveating rolling upgrade
support? We haven't advertised rolling upgrade support between major
releases outside of dev lists and JIRA. As a new major release, our compat
guidelines allow us to break compatibility, so I don't think it's expected
by users.

Best,
Andrew

On Wed, Dec 13, 2017 at 12:37 PM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> I was waiting for Daniel to post the minutes from the YARN meetup to talk
> about this. Anyway, in that discussion, we identified a bunch of key
> upgrade-related scenarios that no one seems to have validated - at least
> from the representation at the YARN meetup. I'm going to create a wiki page
> listing all these scenarios.
>
> But back to the bug that Junping raised. At this point, we don't have a
> clear path towards running 2.x applications on 3.0.0 clusters. So, our
> claim of rolling-upgrades already working is not accurate.
>
> One of the two options that Junping proposed should be pursued before we
> close the release. I'm in favor of calling out that rolling-upgrade support
> is withdrawn or caveated, and pushing for progress instead of blocking the
> release.
>
> Thanks
> +Vinod
>
> > On Dec 12, 2017, at 5:44 PM, Junping Du <j...@hortonworks.com> wrote:
> >
> > Thanks Andrew for pushing a new RC for 3.0.0. I was out last week and
> > just got a chance to validate the new RC now.
> >
> > Basically, I found two critical issues in the same rolling upgrade
> > scenario where HADOOP-15059 was found previously:
> > HDFS-12920: we changed the value format for some hdfs configurations that
> > an old-version MR client doesn't understand when fetching these
> > configurations. A quick workaround is to add the old values (without time
> > units) in hdfs-site.xml to override the new default values, but this
> > generates many annoying warnings. I provided my fix suggestions on the
> > JIRA already for more discussion.
> > The other one is YARN-7646. After we work around HDFS-12920, we hit the
> > issue that an old-version MR AppMaster cannot communicate with the new
> > version of the YARN RM - it could be related to the resource profile
> > changes on the YARN side, but the root cause is still under investigation.
> >
> > The first issue may not qualify as a blocker, given we can work around it
> > without a code change. I am not sure yet whether we can work around the
> > 2nd issue. If not, we may have to fix it, or compromise by withdrawing
> > rolling upgrade support before calling this a stable release.
> >
> >
> > Thanks,
> >
> > Junping
> >
> > 
> > From: Robert Kanter <rkan...@cloudera.com>
> > Sent: Tuesday, December 12, 2017 3:10 PM
> > To: Arun Suresh
> > Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron
> T. Myers; common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> > Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
> >
> > +1 (binding)
> >
> > + Downloaded the binary release
> > + Deployed on a 3 node cluster on CentOS 7.3
> > + Ran some MR jobs, clicked around the UI, etc
> > + Ran some CLI commands (yarn logs, etc)
> >
> > Good job everyone on Hadoop 3!
> >
> >
> > - Robert
> >
> > On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh <asur...@apache.org> wrote:
> >
> >> +1 (binding)
> >>
> >> - Verified signatures of the source tarball.
> >> - built from source - using the docker build environment.
> >> - set up a pseudo-distributed test cluster.
> >> - ran basic HDFS commands
> >> - ran some basic MR jobs
> >>
> >> Cheers
> >> -Arun
> >>
> >> On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang <andrew.w...@cloudera.com>
> >> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> As a reminder, this vote closes tomorrow at 12:31pm, so please give it
> a
> >>> whack if you have time. There are already enough binding +1s to pass
> this
> >>> vote, but it'd be great to get additional validation.
> >>>
> >>> Thanks to everyone who's voted thus far!
> >>>
> >>> Best,
> >>> Andrew
> >>>
> >>>
>

Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-12 Thread Andrew Wang
Hi everyone,

As a reminder, this vote closes tomorrow at 12:31pm, so please give it a
whack if you have time. There are already enough binding +1s to pass this
vote, but it'd be great to get additional validation.

Thanks to everyone who's voted thus far!

Best,
Andrew



On Tue, Dec 12, 2017 at 11:08 AM, Lei Xu <l...@cloudera.com> wrote:

> +1 (binding)
>
> * Verified src tarball and bin tarball, verified md5 of each.
> * Build source with -Pdist,native
> * Started a pseudo cluster
> * Run ec -listPolicies / -getPolicy / -setPolicy on /, and run hdfs
> dfs put/get/cat on "/" with XOR-2-1 policy.
>
> Thanks Andrew for this great effort!
>
> Best,
>
>
> On Tue, Dec 12, 2017 at 9:55 AM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> > Hi Wei-Chiu,
> >
> > The patchprocess directory is left over from the create-release process,
> > and it looks empty to me. We should still file a create-release JIRA to
> fix
> > this, but I think this is not a blocker. Would you agree?
> >
> > Best,
> > Andrew
> >
> > On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang <weic...@cloudera.com>
> > wrote:
> >
> >> Hi Andrew, thanks for the tremendous effort.
> >> I found an empty "patchprocess" directory in the source tarball, which is
> >> not there if you clone from github. Any chance you might have some
> >> leftover trash from when you made the tarball?
> >> Not wanting to nitpick, but you might want to double check so we don't
> >> ship anything private to you in public :)
> >>
> >>
> >>
> >> On Tue, Dec 12, 2017 at 7:48 AM, Ajay Kumar <ajay.ku...@hortonworks.com>
> >> wrote:
> >>
> >>> +1 (non-binding)
> >>> Thanks for driving this, Andrew Wang!!
> >>>
> >>> - downloaded the src tarball and verified md5 checksum
> >>> - built from source with jdk 1.8.0_111-b14
> >>> - brought up a pseudo distributed cluster
> >>> - did basic file system operations (mkdir, list, put, cat) and
> >>> confirmed that everything was working
> >>> - Run word count, pi and DFSIOTest
> >>> - run hdfs and yarn, confirmed that the NN, RM web UI worked
> >>>
> >>> Cheers,
> >>> Ajay
> >>>
> >>> On 12/11/17, 9:35 PM, "Xiao Chen" <x...@cloudera.com> wrote:
> >>>
> >>> +1 (binding)
> >>>
> >>> - downloaded src tarball, verified md5
> >>> - built from source with jdk1.8.0_112
> >>> - started a pseudo cluster with hdfs and kms
> >>> - sanity checked encryption related operations working
> >>> - sanity checked webui and logs.
> >>>
> >>> -Xiao
> >>>
> >>> On Mon, Dec 11, 2017 at 6:10 PM, Aaron T. Myers <a...@apache.org>
> >>> wrote:
> >>>
> >>> > +1 (binding)
> >>> >
> >>> > - downloaded the src tarball and built the source (-Pdist
> -Pnative)
> >>> > - verified the checksum
> >>> > - brought up a secure pseudo distributed cluster
> >>> > - did some basic file system operations (mkdir, list, put, cat)
> and
> >>> > confirmed that everything was working
> >>> > - confirmed that the web UI worked
> >>> >
> >>> > Best,
> >>> > Aaron
> >>> >
> >>> > On Fri, Dec 8, 2017 at 12:31 PM, Andrew Wang <
> >>> andrew.w...@cloudera.com>
> >>> > wrote:
> >>> >
> >>> > > Hi all,
> >>> > >
> >>> > > Let me start, as always, by thanking the efforts of all the
> >>> contributors
> >>> > > who contributed to this release, especially those who jumped on
> >>> the
> >>> > issues
> >>> > > found in RC0.
> >>> > >
> >>> > > I've prepared RC1 for Apache Hadoop 3.0.0. This release
> >>> incorporates 302
> >>> > > fixed JIRAs since the previous 3.0.0-beta1 release.
> >>> > >
> >>> > > You can find the artifacts here:
> >>> > >
> >>> > > http://home.apache.org/~wang/3.0.0-RC1/
> >>> > >
> >>> > > I've done the traditional testing of building from the source
> >>> tarball and
> >>> > > running a Pi job on a single node cluster. I also verified that
> >>> the
> >>> > shaded
> >>> > > jars are not empty.
> >>> > >
> >>> > > Found one issue that create-release (probably due to the mvn
> >>> deploy
> >>> > change)
> >>> > > didn't sign the artifacts, but I fixed that by calling mvn one
> >>> more time.
> >>> > > Available here:
> >>> > >
> >>> > > https://repository.apache.org/content/repositories/orgapache
> >>> hadoop-1075/
> >>> > >
> >>> > > This release will run the standard 5 days, closing on Dec 13th
> at
> >>> 12:31pm
> >>> > > Pacific. My +1 to start.
> >>> > >
> >>> > > Best,
> >>> > > Andrew
> >>> > >
> >>> >
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -
> >>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >>>
> >>
> >>
> >>
> >>
>
>
>
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
>


[jira] [Created] (HADOOP-15112) create-release didn't sign artifacts

2017-12-12 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-15112:


 Summary: create-release didn't sign artifacts
 Key: HADOOP-15112
 URL: https://issues.apache.org/jira/browse/HADOOP-15112
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang


While building the 3.0.0 RC1, I had to re-invoke Maven because the 
create-release script didn't deploy signatures to Nexus. Looking at the repo 
(and my artifacts), it seems like "sign" didn't run properly.

I lost my create-release output, but I noticed that it will log and continue 
rather than abort in some error conditions. This might have caused my lack of 
signatures. IMO it'd be better to explicitly fail in these situations.
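A minimal sketch of the proposed fail-fast behavior (illustrative only, not
the actual create-release code; run_step and MVN_ARGS are made-up names):

    set -euo pipefail   # abort on the first failure instead of logging on

    run_step() {
      echo "create-release: running: $*"
      if ! "$@"; then
        echo "ERROR: step failed: $*" >&2
        exit 1
      fi
    }

    run_step mvn deploy "${MVN_ARGS[@]}"   # a signing failure now aborts the build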



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-12 Thread Andrew Wang
Hi Wei-Chiu,

The patchprocess directory is left over from the create-release process,
and it looks empty to me. We should still file a create-release JIRA to fix
this, but I think this is not a blocker. Would you agree?

Best,
Andrew
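(For anyone else spot-checking an RC for stray files, one rough approach -
tag and file names assumed - is to diff the source tarball against a clean
checkout:)

    tar -xzf hadoop-3.0.0-src.tar.gz
    git clone --depth 1 --branch release-3.0.0-RC1 \
        https://github.com/apache/hadoop.git hadoop-git
    # anything present only in the tarball deserves a closer look
    diff -rq hadoop-3.0.0-src hadoop-git | grep 'Only in hadoop-3.0.0-src'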

On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang <weic...@cloudera.com>
wrote:

> Hi Andrew, thanks for the tremendous effort.
> I found an empty "patchprocess" directory in the source tarball, that is
> not there if you clone from github. Any chance you might have some leftover
> trash when you made the tarball?
> Not wanting to nitpick, but you might want to double-check so we don't
> ship anything private to you in public :)
>
>
>
> On Tue, Dec 12, 2017 at 7:48 AM, Ajay Kumar <ajay.ku...@hortonworks.com>
> wrote:
>
>> +1 (non-binding)
>> Thanks for driving this, Andrew Wang!!
>>
>> - downloaded the src tarball and verified md5 checksum
>> - built from source with jdk 1.8.0_111-b14
>> - brought up a pseudo distributed cluster
>> - did basic file system operations (mkdir, list, put, cat) and
>> confirmed that everything was working
>> - Run word count, pi and DFSIOTest
>> - run hdfs and yarn, confirmed that the NN, RM web UI worked
>>
>> Cheers,
>> Ajay
>>
>> On 12/11/17, 9:35 PM, "Xiao Chen" <x...@cloudera.com> wrote:
>>
>> +1 (binding)
>>
>> - downloaded src tarball, verified md5
>> - built from source with jdk1.8.0_112
>> - started a pseudo cluster with hdfs and kms
>> - sanity checked encryption related operations working
>> - sanity checked webui and logs.
>>
>> -Xiao
>>
>> On Mon, Dec 11, 2017 at 6:10 PM, Aaron T. Myers <a...@apache.org>
>> wrote:
>>
>> > +1 (binding)
>> >
>> > - downloaded the src tarball and built the source (-Pdist -Pnative)
>> > - verified the checksum
>> > - brought up a secure pseudo distributed cluster
>> > - did some basic file system operations (mkdir, list, put, cat) and
>> > confirmed that everything was working
>> > - confirmed that the web UI worked
>> >
>> > Best,
>> > Aaron
>> >
>> > On Fri, Dec 8, 2017 at 12:31 PM, Andrew Wang <
>> andrew.w...@cloudera.com>
>> > wrote:
>> >
>> > > Hi all,
>> > >
>> > > Let me start, as always, by thanking the efforts of all the
>> contributors
>> > > who contributed to this release, especially those who jumped on
>> the
>> > issues
>> > > found in RC0.
>> > >
>> > > I've prepared RC1 for Apache Hadoop 3.0.0. This release
>> incorporates 302
>> > > fixed JIRAs since the previous 3.0.0-beta1 release.
>> > >
>> > > You can find the artifacts here:
>> > >
>> > > http://home.apache.org/~wang/3.0.0-RC1/
>> > >
>> > > I've done the traditional testing of building from the source
>> tarball and
>> > > running a Pi job on a single node cluster. I also verified that
>> the
>> > shaded
>> > > jars are not empty.
>> > >
>> > > Found one issue that create-release (probably due to the mvn
>> deploy
>> > change)
>> > > didn't sign the artifacts, but I fixed that by calling mvn one
>> more time.
>> > > Available here:
>> > >
>> > > https://repository.apache.org/content/repositories/orgapache
>> hadoop-1075/
>> > >
>> > > This release will run the standard 5 days, closing on Dec 13th at
>> 12:31pm
>> > > Pacific. My +1 to start.
>> > >
>> > > Best,
>> > > Andrew
>> > >
>> >
>>
>>
>>
>>
>>
>>
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>
>
>
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-11 Thread Andrew Wang
Good point on the mutability. Release tags are immutable, RCs are not.
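(For anyone verifying locally, the tag can be resolved to its commit with
something like the following; the tag name is assumed:)

    git fetch origin --tags
    git rev-parse release-3.0.0-RC1^{commit}
    # should print c25427ceca461ee979d30edd7a4b0f50718e6533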

On Mon, Dec 11, 2017 at 1:39 PM, Sangjin Lee <sj...@apache.org> wrote:

> Thanks Andrew. For the record, the commit id would be
> c25427ceca461ee979d30edd7a4b0f50718e6533. I mention that for completeness
> because of the mutability of tags.
>
> On Mon, Dec 11, 2017 at 10:31 AM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Sorry, forgot to push the tag. It's up there now.
>>
>> On Sun, Dec 10, 2017 at 8:31 PM, Vinod Kumar Vavilapalli <
>> vino...@apache.org> wrote:
>>
>>> I couldn't find the release tag for RC1 either - is it just me or has
>>> the release-process changed?
>>>
>>> +Vinod
>>>
>>> > On Dec 10, 2017, at 4:31 PM, Sangjin Lee <sj...@apache.org> wrote:
>>> >
>>> > Hi Andrew,
>>> >
>>> > Thanks much for your effort! Just to be clear, could you please state
>>> the
>>> > git commit id of the RC1 we're voting for?
>>> >
>>> > Sangjin
>>> >
>>> > On Fri, Dec 8, 2017 at 12:31 PM, Andrew Wang <andrew.w...@cloudera.com
>>> >
>>> > wrote:
>>> >
>>> >> Hi all,
>>> >>
>>> >> Let me start, as always, by thanking the efforts of all the
>>> contributors
>>> >> who contributed to this release, especially those who jumped on the
>>> issues
>>> >> found in RC0.
>>> >>
>>> >> I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates
>>> 302
>>> >> fixed JIRAs since the previous 3.0.0-beta1 release.
>>> >>
>>> >> You can find the artifacts here:
>>> >>
>>> >> http://home.apache.org/~wang/3.0.0-RC1/
>>> >>
>>> >> I've done the traditional testing of building from the source tarball
>>> and
>>> >> running a Pi job on a single node cluster. I also verified that the
>>> shaded
>>> >> jars are not empty.
>>> >>
>>> >> Found one issue that create-release (probably due to the mvn deploy
>>> change)
>>> >> didn't sign the artifacts, but I fixed that by calling mvn one more
>>> time.
>>> >> Available here:
>>> >>
>>> >> https://repository.apache.org/content/repositories/orgapache
>>> hadoop-1075/
>>> >>
>>> >> This release will run the standard 5 days, closing on Dec 13th at
>>> 12:31pm
>>> >> Pacific. My +1 to start.
>>> >>
>>> >> Best,
>>> >> Andrew
>>> >>
>>>
>>>
>>
>


[VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-08 Thread Andrew Wang
Hi all,

Let me start, as always, by thanking the efforts of all the contributors
who contributed to this release, especially those who jumped on the issues
found in RC0.

I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302
fixed JIRAs since the previous 3.0.0-beta1 release.

You can find the artifacts here:

http://home.apache.org/~wang/3.0.0-RC1/

I've done the traditional testing of building from the source tarball and
running a Pi job on a single node cluster. I also verified that the shaded
jars are not empty.

Found one issue that create-release (probably due to the mvn deploy change)
didn't sign the artifacts, but I fixed that by calling mvn one more time.
Available here:

https://repository.apache.org/content/repositories/orgapachehadoop-1075/

This release will run the standard 5 days, closing on Dec 13th at 12:31pm
Pacific. My +1 to start.

Best,
Andrew
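(A condensed version of the checks voters report in this thread might look
like the following; file names are assumed, and releases of this era shipped
md5 digests alongside the gpg signatures:)

    gpg --verify hadoop-3.0.0-src.tar.gz.asc hadoop-3.0.0-src.tar.gz
    md5sum hadoop-3.0.0-src.tar.gz        # compare against the published .md5
    tar -xzf hadoop-3.0.0-src.tar.gz && cd hadoop-3.0.0-src
    mvn clean install -Pdist,native -DskipTests -Dtar
    # then bring up a pseudo-distributed cluster and run a Pi job from the dist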


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-12-08 Thread Andrew Wang
FYI that we got our last blocker in today, so I'm currently rolling RC1.
Stay tuned!

On Thu, Nov 30, 2017 at 8:32 AM, Allen Wittenauer 
wrote:

>
> > On Nov 30, 2017, at 1:07 AM, Rohith Sharma K S <
> rohithsharm...@apache.org> wrote:
> >
> >
> > >. If ATSv1 isn’t replaced by ATSv2, then why is it marked deprecated?
> > Ideally it should not be. Can you point out where it is marked as
> deprecated? If it is in historyserver daemon start, that change made very
> long back when timeline server added.
>
>
> Ahh, I see where all the problems lie.  No one is paying attention to the
> deprecation message because it’s kind of oddly worded:
>
> * It really means “don’t use ‘yarn historyserver’ use ‘yarn
> timelineserver’ ”
> * ‘yarn historyserver’ was removed from the documentation in 2.7.0
> * ‘yarn historyserver’ doesn’t appear in the yarn usage output
> * ‘yarn timelineserver’ runs the exact same class
>
> There’s no reason for ‘yarn historyserver’ to exist in 3.x.  Just run
> ‘yarn timelineserver’ instead.
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


2017-12-01 Hadoop 3 release status update

2017-12-01 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-12-01

Haven't written one of these in a month. I had high hopes for RC0, but it
failed due to HADOOP-15058 (create-release site build outputs dummy shaded
jars due to skipShade, PATCH AVAILABLE), which Sangjin found, and then a
number of other blockers were found shortly after that.

We're back to blocker burndown. My new (realistic) goal is to get 3.0.0 out
before Christmas. We could always use more help with reviews; most things
are patch available.



Highlights:

Red flags:

Previously tracked blockers that have been resolved or dropped:

GA blockers:

   - HDFS-12840 - Creating a replicated file in an EC zone is not correctly
   serialized in EditLogs (PATCH AVAILABLE): Has gone through several rounds
   of review, looks close.
   - HADOOP-15080 - Cat-X transitive dependency on org.json library via
   json-lib (OPEN): New issue, waiting on LEGAL but we might need to pull
   this entire feature.
   - HADOOP-15059 - 3.0 deployment cannot work with old version MR tar ball
   which breaks rolling upgrade (PATCH AVAILABLE): Has gone through some
   review and has a +1 from Daryn, could use confirmation from Vinod and
   others.
   - HADOOP-15058 - create-release site build outputs dummy shaded jars due
   to skipShade (PATCH AVAILABLE): Needs review, asked Allen but might need
   someone else to help.

GA criticals:

   - HDFS-12872 - EC Checksum broken when BlockAccessToken is enabled
   (PATCH AVAILABLE): Patch needs review.
   - YARN-7381 - Enable the configuration
   yarn.nodemanager.log-container-debug-info.enabled (PATCH AVAILABLE): Has
   gone through some review and Wangda +1'd, could use confirmation from Ray
   and others.

Features merged for GA:

   - Erasure coding
      - Testing is still ongoing at Cloudera, which resulted in HDFS-12840
      (Creating a replicated file in an EC zone is not correctly serialized
      in EditLogs, PATCH AVAILABLE) and HDFS-12872 (EC Checksum broken when
      BlockAccessToken is enabled, PATCH AVAILABLE).
   - Classpath isolation (HADOOP-11656)
      - No change.
   - Compat guide (HADOOP-13714)
      - We slid a couple more changes into 3.0.0 after RC0 was cancelled,
      making this work more complete.
   - TSv2 alpha 2
      - No change.
   - API-based scheduler configuration (YARN-5734 - OrgQueue for easy
   CapacityScheduler queue configuration management, RESOLVED)
      - No change.
   - HDFS router-based federation (HDFS-10467 - Router-based HDFS
   federation, RESOLVED)
      - No change.
   - Resource types (YARN-3926 - Extend the YARN resource model for easier
   resource-type management and profiles, RESOLVED)
      - Had some post-merge issues that were resolved, nothing outstanding.


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-21 Thread Andrew Wang
Hi folks,

Thanks again for the testing help with the RC. Here's our dashboard for the
3.0.0 release:

https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329849

Right now we're tracking three blockers:

* HADOOP-15058 is the create-release fix, I just put up a patch which needs
reviews. It's the worst timing, but I'm hoping Allen could give it a quick
sanity check.
* HADOOP-15059 is the MR rolling upgrade issue that Junping found, needs
triage and an assignee. I asked Ray to look at what we've done with our
existing rolling upgrade testing, since it does run an MR job.
* HDFS-12480 is an EC issue that Eddy would like to get in if we're rolling
another RC, looks close.

Is there anything else from this thread that needs to be addressed? I rely
on the dashboard to track blockers, so please file a JIRA and prioritize if
so.

Best,
Andrew



On Tue, Nov 21, 2017 at 2:08 PM, Vrushali C  wrote:

> Hi Vinod,
>
> bq. (b) We need to figure out if this V1 TimelineService should even be
> support given ATSv2.
>
> Yes, I am following this discussion. Let me chat with Rohith and Varun
> about this and we will respond on this thread. As such, my preliminary
> thoughts are that we should continue to support Timeline Service V1 till we
> have the detailed entity level ACLs in V2 and perhaps also a proposal
> around upgrade/migration paths from TSv1 to TSv2.
>
> But in any case, we do need to work towards phasing out Timeline Service
> V1.
>
> thanks
> Vrushali
>
>
> On Tue, Nov 21, 2017 at 1:16 PM, Vinod Kumar Vavilapalli <
> vino...@apache.org
> > wrote:
>
> > >> - $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start historyserver doesn't
> > even work. Not just deprecated in favor of timelineserver as was
> advertised.
> > >
> > >   This works for me in trunk and the bash code doesn’t appear to
> > have changed in a very long time.  Probably something local to your
> > install.  (I do notice that the deprecation message says “starting” which
> > is awkward when the stop command is given though.)  Also: is the
> > deprecation message even true at this point?
> >
> >
> > Sorry, I mischaracterized the problem.
> >
> > The real issue is that I cannot use this command line when the MapReduce
> > JobHistoryServer is already started on the same machine.
> >
> > ~/tmp/yarn$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start historyserver
> > WARNING: Use of this script to start YARN daemons is deprecated.
> > WARNING: Attempting to execute replacement "yarn --daemon start" instead.
> > DEPRECATED: Use of this command to start the timeline server is
> deprecated.
> > Instead use the timelineserver command for it.
> > Starting the History Server anyway...
> > historyserver is running as process 86156.  Stop it first.
> >
> > So, it looks like in shell-scripts, there can ever be only one daemon of
> a
> > given name, irrespective of which daemon scripts are invoked.
> >
> > We need to figure out two things here
> >  (a) The behavior of this command. Clearly, it will conflict with the
> > MapReduce JHS - only one of them can be started on the same node.
> >  (b) We need to figure out if this V1 TimelineService should even be
> > support given ATSv2.
> >
> > @Vrushani / @Rohith / @Varun Saxena et.al, if you are watching, please
> > comment on (b).
> >
> > Thanks
> > +Vinod
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-21 Thread Andrew Wang
On Mon, Nov 20, 2017 at 11:33 PM, Allen Wittenauer  wrote:

>
> The original release script and instructions broke the build up
> into three or so steps. When I rewrote it, I kept that same model. It’s
> probably time to re-think that.  In particular, it should probably be one
> big step that even does the maven deploy.  There’s really no harm in doing
> that given that there is still a manual step to release the deployed jars
> into the production area.
>
> We just need need to:
>
> a) add an option to do deploy instead of just install.  if c-r is in asf
> mode, always activate deploy
> b) pull the maven settings.xml file (and only the maven settings file… we
> don’t want the repo!) into the docker build environment
> c) consolidate the mvn steps
>
> This has the added benefit of greatly speeding up the build by
> removing several passes.
>
> Probably not a small change, but I’d have to look at the code.
> I’m on a plane tomorrow morning though.
>
I refreshed my memory on this yesterday, and came to a similar conclusion.
+1 to this approach. It'd also solve our current issue, if we build the
site and site tarball after the deploy and building the src/bin tarballs.

So, regarding this current issue, I think our options are:

* Change c-r to do "mvn clean deploy", create the src and bin
tarballs, then run "mvn site" at the end.
* Turn off JDiff in the site build.

I'd like to get off of JDiff since neither the project nor our usage of it
is maintained, but that might be a more controversial action than
changing create-release.

I filed HADOOP-15058 to dig further into this issue.
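For reference, the first option would roughly reorder create-release along
these lines (a sketch; the exact flags are assumed):

    mvn clean deploy -DskipTests        # full build, real shaded jars
    # ...package the src and bin tarballs from this build...
    mvn site -DskipShade                # docs last, so dummy jars can't leak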

Best,
Andrew


[jira] [Created] (HADOOP-15058) create-release site build outputs dummy shaded jars due to skipShade

2017-11-21 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-15058:


 Summary: create-release site build outputs dummy shaded jars due 
to skipShade
 Key: HADOOP-15058
 URL: https://issues.apache.org/jira/browse/HADOOP-15058
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Blocker






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop 2.8.3 Release Plan

2017-11-21 Thread Andrew Wang
The Aliyun OSS code isn't a small improvement. If you look at Sammi's
comment
<https://issues.apache.org/jira/browse/HADOOP-14964?focusedCommentId=16247085&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16247085>,
it's a 17-patch series that is being backported in one shot. What we're
talking about is equivalent to merging a feature branch in a maintenance
release. I see that Kai and Chris are having a discussion about the
dependency changes, which indicates this is not a zero-risk change either.
We really should not be changing dependency versions in a maintenance
release unless it's because of a bug.

It's unfortunate from a timing perspective that this missed 2.9.0, but I
still think it should wait for the next minor. Merging a feature into a
maintenance release sets the wrong precedent.

Best,
Andrew

On Tue, Nov 21, 2017 at 1:08 AM, Junping Du <j...@hortonworks.com> wrote:

> Thanks Kai for calling out this feature/improvement for attention and
> Andrew for comments.
>
>
> While I agree that a maintenance release should focus on important bug
> fixes only, I doubt we have strict rules disallowing features/improvements
> from landing on a maintenance release, especially when they have a small
> footprint or low impact on existing code/features. In practice, we indeed
> had 77 new features/improvements in the latest 2.7.3 and 2.7.4 releases.
>
>
> Back to HADOOP-14964, I did a quick check and it looks like the case here
> is a self-contained improvement with very low impact on the existing code
> base, so I am OK with the improvement landing on branch-2.8, provided it
> is well reviewed and tested.
>
>
> However, as RM of branch-2.8, I have two concerns about accepting it in
> our 2.8.3 release:
>
> 1. Timing - as I mentioned at the beginning, the main purpose of 2.8.3 is
> to deliver several critical bug fixes and we should target releasing it
> very soon - my current plan is to cut the RC within this week, in line
> with the 3.0.0 vote closing. Can this improvement be well tested against
> branch-2.8.3 within this strict timeline? It seems a bit rushed unless we
> have a strong commitment on the test plan and activities in such a tight
> timeframe.
>
>
> 2. Upgrading - I haven't heard that we've settled on a plan for releasing
> this feature in 2.9.1 - though I saw some discussions going on at
> HADOOP-14964. Assuming 2.8.3 is released ahead of 2.9.1 and includes this
> improvement, users consuming this feature/improvement would have no 2.9
> release to upgrade to, or would be forced to upgrade with a regression. We
> may need a better upgrade story here.
>
>
> Pls let me know what you think. Thanks!
>
>
>
> Thanks,
>
>
> Junping
>
>
> --
> *From:* Andrew Wang <andrew.w...@cloudera.com>
> *Sent:* Monday, November 20, 2017 10:22 PM
> *To:* Zheng, Kai
> *Cc:* Junping Du; common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> *Subject:* Re: Apache Hadoop 2.8.3 Release Plan
>
> I'm against including new features in maintenance releases, since they're
> meant to be bug-fix only.
>
> If we're struggling with being able to deliver new features in a safe and
> timely fashion, let's try to address that, not overload the meaning of
> "maintenance release".
>
> Best,
> Andrew
>
> On Mon, Nov 20, 2017 at 5:20 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
>
>> Hi Junping,
>>
>> Thank you for making 2.8.2 happen and now planning the 2.8.3 release.
>>
>> I have an ask: is it convenient to include the backport work for the OSS
>> connector module? We have some Hadoop users who wish to have it by default
>> for convenience, though in the past they used it by backporting it
>> themselves. I have raised this and got thoughts from Chris and Steve. It
>> looks like this is more wanted for 2.9, but I wanted to take this chance
>> to ask again here for broader feedback and thoughts. The backport patch is
>> available for 2.8 and the one for branch-2 is already in. IMO, 2.8.x is
>> promising as we can see some shift from 2.7.x, hence it's worth more
>> important features and effort. What do you think? Thanks!
>>
>> https://issues.apache.org/jira/browse/HADOOP-14964
>>
>> Regards,
>> Kai
>>
>> -Original Message-
>> From: Junping Du [mailto:j...@hortonworks.com]
>> Sent: Tuesday, November 14, 2017 9:02 AM
>> To: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
>> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
>> Subject: Apache Hadoop 2.8.3 Release Plan
>>
>> Hi,
>> We have several important fixes landed on branch-2.8 and I would
>> like to cut branch-2.8.3 now to start 2.8.3 release work.

Re: Apache Hadoop 2.8.3 Release Plan

2017-11-20 Thread Andrew Wang
>
>
> >> If we're struggling with being able to deliver new features in a safe
> and timely fashion, let's try to address that...
>
> This is interesting. Are you aware of any means to do that? Thanks!
>
I've mentioned this a few times on the lists before, but our biggest gap
in keeping branches releasable is automated integration testing.

I think we try to put too much into our minor releases, and features arrive
before they're baked. Having automated integration testing helps with this.
When we were finally able to turn on CI for the 3.0.0 release branch, we
started finding bugs much sooner after they were introduced, which made it
easier to revert before too much other code was built on top. The early
alphas felt Sisyphean at times, with bugs being introduced faster than we
could uncover and fix them.

A smaller example would be release validation. I've long wanted a nightly
Jenkins job that makes an RC and runs some basic checks on it. We end up
rolling extra RCs for small stuff that could have been caught earlier.
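A nightly job along those lines could be a short script (a sketch; the
script name, artifact paths, and size threshold are assumptions):

    #!/usr/bin/env bash
    # nightly-rc-smoke.sh: build an RC nightly and sanity-check the artifacts
    set -euo pipefail
    git clone --depth 1 https://github.com/apache/hadoop.git && cd hadoop
    dev-support/bin/create-release --docker
    # flag suspiciously tiny "shaded" client jars (the RC0 failure mode)
    if find target/artifacts -name 'hadoop-client-*.jar' -size -1M | grep -q .; then
      echo "ERROR: shaded client jars look empty" >&2
      exit 1
    fi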

Best,
Andrew


Re: Apache Hadoop 2.8.3 Release Plan

2017-11-20 Thread Andrew Wang
I'm against including new features in maintenance releases, since they're
meant to be bug-fix only.

If we're struggling with being able to deliver new features in a safe and
timely fashion, let's try to address that, not overload the meaning of
"maintenance release".

Best,
Andrew

On Mon, Nov 20, 2017 at 5:20 PM, Zheng, Kai  wrote:

> Hi Junping,
>
> Thank you for making 2.8.2 happen and now planning the 2.8.3 release.
>
> I have an ask: is it convenient to include the backport work for the OSS
> connector module? We have some Hadoop users who wish to have it by default
> for convenience, though in the past they used it by backporting it
> themselves. I have raised this and got thoughts from Chris and Steve. It
> looks like this is more wanted for 2.9, but I wanted to take this chance
> to ask again here for broader feedback and thoughts. The backport patch is
> available for 2.8 and the one for branch-2 is already in. IMO, 2.8.x is
> promising as we can see some shift from 2.7.x, hence it's worth more
> important features and effort. What do you think? Thanks!
>
> https://issues.apache.org/jira/browse/HADOOP-14964
>
> Regards,
> Kai
>
> -Original Message-
> From: Junping Du [mailto:j...@hortonworks.com]
> Sent: Tuesday, November 14, 2017 9:02 AM
> To: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Apache Hadoop 2.8.3 Release Plan
>
> Hi,
> We have several important fixes landed on branch-2.8 and I would
> like to cut branch-2.8.3 now to start 2.8.3 release work.
> So far, I don't see any pending blockers on 2.8.3, so my current plan
> is to cut the first RC of 2.8.3 in the next several days:
>  -  For all commits landing on branch-2.8 from now on, please mark the
> fix version as 2.8.4.
>  -  If there is a really important fix for 2.8.3 that is close to being
> committed, please notify me before landing it on branch-2.8.3.
> Please let me know if you have any thoughts or comments on the plan.
>
> Thanks,
>
> Junping
> 
> From: dujunp...@gmail.com  on behalf of 俊平堵 <
> junping...@apache.org>
> Sent: Friday, October 27, 2017 3:33 PM
> To: gene...@hadoop.apache.org
> Subject: [ANNOUNCE] Apache Hadoop 2.8.2 Release.
>
> Hi all,
>
> It gives me great pleasure to announce that the Apache Hadoop
> community has voted to release Apache Hadoop 2.8.2, which is now available
> for download from Apache mirrors[1]. For download instructions please refer
> to the Apache Hadoop Release page [2].
>
> Apache Hadoop 2.8.2 is the first GA release of the Apache Hadoop 2.8 line
> and our newest stable release for the entire Apache Hadoop project. For
> major changes included in the Hadoop 2.8 line, please refer to the Hadoop
> 2.8.2 main page[3].
>
> This release has 315 resolved issues since the previous 2.8.1 release
> with the following
> breakdown:
>- 91 in Hadoop Common
>- 99 in HDFS
>- 105 in YARN
>- 20 in MapReduce
> Please read the log of CHANGES[4] and RELEASENOTES[5] for more details.
>
> The release news is posted on the Hadoop website too, you can go to the
> downloads section directly [6].
>
> Thank you all for contributing to the Apache Hadoop release!
>
>
> Cheers,
>
> Junping
>
>
> [1] http://www.apache.org/dyn/closer.cgi/hadoop/common
>
> [2] http://hadoop.apache.org/releases.html
>
> [3] http://hadoop.apache.org/docs/r2.8.2/index.html
>
> [4]
> http://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/
> hadoop-common/release/2.8.2/CHANGES.2.8.2.html
>
> [5]
> http://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/
> hadoop-common/release/2.8.2/RELEASENOTES.2.8.2.html
>
> [6] http://hadoop.apache.org/releases.html#Download
>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-20 Thread Andrew Wang
On Mon, Nov 20, 2017 at 9:59 PM, Sangjin Lee <sj...@apache.org> wrote:

>
> On Mon, Nov 20, 2017 at 9:46 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Thanks for the spot Sangjin. I think this bug was introduced in
>> create-release by HADOOP-14835. The multi-pass maven build generates these
>> dummy client jars during the site build since skipShade is specified.
>>
>> This might be enough to cancel the RC. Thoughts?
>>
>
> IMO yes. This was one of the key features mentioned in the 3.0 release
> notes. I appreciate your effort for the release Andrew!
>
>
Yea, I was leaning that way too. Let's cancel this RC. I hope to have a new
RC up tomorrow. With the upcoming holidays, we'll probably have to extend
the vote until mid-next week.

I'm also worried about the "mvn deploy" step since I thought it was safe to
specify skipShade there too. I'll check that as well.

Best,
Andrew


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-20 Thread Andrew Wang
Thanks for the thorough review Vinod, some inline responses:

*Issues found during testing*
>
> Major
>  - The previously supported way of being able to use different tar-balls
> for different sub-modules is completely broken - common and HDFS tar.gz are
> completely empty.
>

Is this something people use? I figured that the sub-tarballs were a relic
from the project split, and nowadays Hadoop is one project with one release
tarball. I actually thought about getting rid of these extra tarballs since
they add extra overhead to a full build.


>  - Cannot enable new UI in YARN because it is under a non-default
> compilation flag. It should be on by default.
>

The yarn-ui profile has always been off by default, AFAIK. It's documented
to turn it on in BUILDING.txt for release builds, and we do it in
create-release.

IMO not a blocker. I think it's also more of a dev question (do we want to
do this on every YARN build?) than a release one.


>  - One decommissioned node in YARN ResourceManager UI always appears to
> start with, even when there are no NodeManagers that are started yet:
> Info :-1, DECOMMISSIONED, null rack. It shows up only in the UI though,
> not in the CLI node -list
>

Is this a blocker? Could we get a JIRA?

Thanks,
Andrew


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-20 Thread Andrew Wang
Thanks for the spot Sangjin. I think this bug was introduced in create-release
by HADOOP-14835. The multi-pass maven build generates these dummy client
jars during the site build since skipShade is specified.

This might be enough to cancel the RC. Thoughts?

Best,
Andrew
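(A quick way to re-check this, for anyone following along; the jar names are
from Sangjin's mail below and the relocation prefix is an assumption:)

    jar tf hadoop-client-api-3.0.0.jar | grep -c '\.class$'
    # a correctly shaded jar reports thousands of classes; the broken ones
    # contain little beyond META-INF/ and pom files
    jar tf hadoop-client-runtime-3.0.0.jar | grep 'org/apache/hadoop/shaded/' | head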

On Mon, Nov 20, 2017 at 7:51 PM, Sangjin Lee <sj...@apache.org> wrote:

> I checked the client jars that are supposed to contain shaded
> dependencies, and they don't look quite right:
>
> $ tar -tzvf hadoop-3.0.0.tar.gz | grep hadoop-client-api-3.0.0.jar
> -rw-r--r--  0 andrew andrew44531 Nov 14 11:53
> hadoop-3.0.0/share/hadoop/client/hadoop-client-api-3.0.0.jar
> $ tar -tzvf hadoop-3.0.0.tar.gz | grep hadoop-client-runtime-3.0.0.jar
> -rw-r--r--  0 andrew andrew45533 Nov 14 11:53
> hadoop-3.0.0/share/hadoop/client/hadoop-client-runtime-3.0.0.jar
> $ tar -tzvf hadoop-3.0.0.tar.gz | grep hadoop-client-minicluster-3.0.0.jar
> -rw-r--r--  0 andrew andrew47015 Nov 14 11:53
> hadoop-3.0.0/share/hadoop/client/hadoop-client-minicluster-3.0.0.jar
>
> When I look at what's inside those jar, they only seem to include
> pom-related files with no class files. Am I missing something?
>
> When I build from the source with -Pdist, I do get much bigger jars:
> total 113760
> -rw-r--r--  1 sangjinlee  120039211  17055399 Nov 20 17:17
> hadoop-client-api-3.0.0.jar
> -rw-r--r--  1 sangjinlee  120039211  20451447 Nov 20 17:19
> hadoop-client-minicluster-3.0.0.jar
> -rw-r--r--  1 sangjinlee  120039211  20730866 Nov 20 17:18
> hadoop-client-runtime-3.0.0.jar
>
> Sangjin
>
> On Mon, Nov 20, 2017 at 5:52 PM, Sangjin Lee <sj...@apache.org> wrote:
>
>>
>>
>> On Mon, Nov 20, 2017 at 5:26 PM, Vinod Kumar Vavilapalli <
>> vino...@apache.org> wrote:
>>
>>> Thanks for all the push, Andrew!
>>>
>>> Looking at the RC. Went through my usual check-list. Here's my summary.
>>> Will cast my final vote after comparing and validating my findings with
>>> others.
>>>
>>> Verification
>>>
>>>  - [Check] Successful recompilation from source tar-ball
>>>  - [Check] Signature verification
>>>  - [Check] Generating dist tarballs from source tar-ball
>>>  - [Check] Testing
>>> -- Start NN, DN, RM, NM, JHS, Timeline Service
>>> -- Ran dist-shell example, MR sleep, wordcount, randomwriter, sort,
>>> grep, pi
>>> -- Tested CLIs to print nodes, apps etc and also navigated UIs
>>>
>>> Issues found during testing
>>>
>>> Major
>>>  - The previously supported way of being able to use different tar-balls
>>> for different sub-modules is completely broken - common and HDFS tar.gz are
>>> completely empty.
>>>  - Cannot enable new UI in YARN because it is under a non-default
>>> compilation flag. It should be on by default.
>>>  - One decommissioned node in YARN ResourceManager UI always appears to
>>> start with, even when there are no NodeManagers that are started yet:  Info
>>> :-1, DECOMMISSIONED, null rack. It shows up only in the UI though, not
>>> in the CLI node -list
>>>
>>> Minor
>>>  - resourcemanager-metrics.out is going into current directory instead
>>> of log directory
>>>  - $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start historyserver doesn't
>>> even work. Not just deprecated in favor of timelineserver as was advertised.
>>>  - Spurious warnings on CLI
>>> 17/11/20 17:07:34 INFO conf.Configuration:
>>> resource-types.xml not found
>>> 17/11/20 17:07:34 INFO resource.ResourceUtils: Unable to
>>> find 'resource-types.xml'.
>>>
>>> Side notes
>>>
>>>  - When did we stop putting CHANGES files into the source artifacts?
>>>  - Even after "mvn install"ing once, shading is repeated again and again
>>> for every new 'mvn install' even though there are no source changes - we
>>> should see how this can be avoided.
>>>  - Compatibility notes
>>> -- NM's env list is curtailed unlike in 2.x (For e.g,
>>> HADOOP_MAPRED_HOME is not automatically inherited. Correct behavior)
>>> -- Sleep is moved from hadoop-mapreduce-client-jobclient-3.0.0.jar
>>> into hadoop-mapreduce-client-jobclient-3.0.0-tests.jar
>>>
>>
>> Sleep has always been in the jobclient test jar as long as I can
>> remember, so it's not new for 3.0.
>>
>>
>>>
>>> Thanks
>>> +Vinod
>>>
>>> > On Nov 14, 2017, at 1:34 PM, Andrew Wang <andrew.w...@cloudera.com>
>>> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Thanks as always to the many, many contributors who helped with this
>>> > release. I've created RC0 for Apache Hadoop 3.0.0. The artifacts are
>>> > available here:
>>> >
>>> > http://people.apache.org/~wang/3.0.0-RC0/
>>> >
>>> > This vote will run 5 days, ending on Nov 19th at 1:30pm Pacific.
>>> >
>>> > 3.0.0 GA contains 291 fixed JIRA issues since 3.0.0-beta1. Notable
>>> > additions include the merge of YARN resource types, API-based
>>> configuration
>>> > of the CapacityScheduler, and HDFS router-based federation.
>>> >
>>> > I've done my traditional testing with a pseudo cluster and a Pi job.
>>> My +1
>>> > to start.
>>> >
>>> > Best,
>>> > Andrew
>>>
>>>
>>
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-17 Thread Andrew Wang
Hi Arpit,

I agree the timing is not great here, but extending it to meaningfully
avoid the holidays would mean extending it an extra week (e.g. to the
29th). We've been coordinating with ASF PR for that Tuesday, so I'd really,
really like to get the RC out before then.

In terms of downstream testing, we've done extensive integration testing
with downstreams via the alphas and betas, and we have continuous
integration running at Cloudera against branch-3.0. Because of this, I have
more confidence in our integration for 3.0.0 than most Hadoop releases.

Is it meaningful to extend to say, the 21st, which provides for a full week
of voting?

Best,
Andrew

On Fri, Nov 17, 2017 at 1:27 PM, Arpit Agarwal <aagar...@hortonworks.com>
wrote:

> Hi Andrew,
>
> Thank you for your hard work in getting us to this step. This is our first
> major GA release in many years.
>
> I feel a 5-day vote window ending over the weekend before thanksgiving may
> not provide sufficient time to evaluate this RC especially for downstream
> components.
>
> Would you please consider extending the voting deadline until a few days
> after the thanksgiving holiday? It would be a courtesy to our broader
> community and I see no harm in giving everyone a few days to evaluate it
> more thoroughly.
>
> On a lighter note, your deadline is also 4 minutes short of the required 5
> days. :)
>
> Regards,
> Arpit
>
>
>
> On 11/14/17, 1:34 PM, "Andrew Wang" <andrew.w...@cloudera.com> wrote:
>
> Hi folks,
>
> Thanks as always to the many, many contributors who helped with this
> release. I've created RC0 for Apache Hadoop 3.0.0. The artifacts are
> available here:
>
> http://people.apache.org/~wang/3.0.0-RC0/
>
> This vote will run 5 days, ending on Nov 19th at 1:30pm Pacific.
>
> 3.0.0 GA contains 291 fixed JIRA issues since 3.0.0-beta1. Notable
> additions include the merge of YARN resource types, API-based
> configuration
> of the CapacityScheduler, and HDFS router-based federation.
>
> I've done my traditional testing with a pseudo cluster and a Pi job.
> My +1
> to start.
>
> Best,
> Andrew
>
>
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-17 Thread Andrew Wang
Thanks for the spot, normally create-release spits those out. I uploaded
asc and mds for the release artifacts.

Best,
Andrew

On Thu, Nov 16, 2017 at 11:33 PM, Akira Ajisaka <aajis...@apache.org> wrote:

> Hi Andrew,
>
> Signatures are missing. Would you upload them?
>
> Thanks,
> Akira
>
>
> On 2017/11/15 6:34, Andrew Wang wrote:
>
>> Hi folks,
>>
>> Thanks as always to the many, many contributors who helped with this
>> release. I've created RC0 for Apache Hadoop 3.0.0. The artifacts are
>> available here:
>>
>> http://people.apache.org/~wang/3.0.0-RC0/
>>
>> This vote will run 5 days, ending on Nov 19th at 1:30pm Pacific.
>>
>> 3.0.0 GA contains 291 fixed JIRA issues since 3.0.0-beta1. Notable
>> additions include the merge of YARN resource types, API-based
>> configuration
>> of the CapacityScheduler, and HDFS router-based federation.
>>
>> I've done my traditional testing with a pseudo cluster and a Pi job. My +1
>> to start.
>>
>> Best,
>> Andrew
>>
>>


Re: [DISCUSS] A final minor release off branch-2?

2017-11-15 Thread Andrew Wang
Hi Junping,

On Wed, Nov 15, 2017 at 1:37 AM, Junping Du  wrote:

> Thanks Vinod to bring up this discussion, which is just in time.
>
> I agree with most responses that option C is not a good choice as our
> community bandwidth is precious and we should focus on very limited
> mainstream branches to develop, test and deployment. Of course, we should
> still follow Apache way to allow any interested committer for rolling up
> his/her own release given specific requirement over the mainstream releases.
>
> I am not biased on option A or B (I will discuss this later), but I think
> a bridge release for upgrading to and back from 3.x is very necessary.
> The reasons are obviously:
> 1. Given the lessons learned from the previous migration from 1.x to
> 2.x, no matter how careful we tend to be, there is still a chance that
> some level of compatibility (source, binary, configuration, etc.) gets
> broken in the migration to a new major release. Some of these
> incompatibilities can only be identified at runtime, after the GA release
> is widely deployed in production clusters - we have tons of downstream
> projects and numerous configurations, and we cannot cover them all with
> in-house deployment and testing.
>

Source and binary compatibility are not required for 3.0.0. It's a new
major release, and there are known, documented incompatibilities in this
regard.

That said, we've done far, far more in this regard compared to previous
major or minor releases. We've compiled all of CDH against Hadoop 3 and run
our suite of system tests for the platform. We've been testing in this way
since 3.0.0-alpha1 and found and fixed plenty of source and binary
compatibility issues during the alpha and beta process. Many of these fixes
trickled down into 2.8 and 2.9.

>
> 2. From recent classpath isolation work, I was surprised to find out that
> many of our downstream projects (HBase, Tez, etc.) are still consuming
> many non-public, server-side APIs of Hadoop, not to mention the
> projects/products outside of the Hadoop ecosystem. Our API compatibility
> tests do not (and should not) cover these cases. We can claim that a new
> major release shouldn't be responsible for these private API changes. But
> given the possibility of breaking existing applications in some way, users
> could be very hesitant to migrate to a 3.x release if there is no safe way
> to roll back.
>

This is true for 2.x releases as well. Similar to the previous answer,
we've compiled all of CDH against Hadoop 3, providing a much higher level
of assurance even compared to 2.x releases.

>
> 3. Besides incompatibilities, it is also possible to have performance
> regressions (lower throughput, higher latency, slower job runs, bigger
> memory footprint or even memory leaks, etc.) in new Hadoop releases.
> While the performance impact of migration (if any) could be negligible to
> some users, other users could be very sensitive and wish to roll back if
> it happens on their production cluster.
>
Yes, bugs exist. I won't claim that 3.0.0 is bug-free. All new releases
can potentially introduce new bugs.

However, I don't think rollback is the solution. In my experience, users
rarely rollback since it's so disruptive and causes data loss. It's much
more common that they patch and upgrade. With that in mind, I'd rather we
spend our effort on making 3.0.x high-quality vs. making it easier to
rollback.

The root of my concern in announcing a "bridge release" is that it
discourages users from upgrading to 3.0.0 until a bridge release is out. I
strongly believe the level of quality provided by 3.0.0 is at least equal
to new 2.x minor releases, given our extended testing and integration
process, and we don't have bridge releases for 2.x.

This is why I asked for a list of known issues with 2.x -> 3.0 upgrades,
that would necessitate a bridge release. Arun raised a concern about NM
rollback. Are there any other *known* issues?

Best,
Andrew


Re: [DISCUSS] A final minor release off branch-2?

2017-11-14 Thread Andrew Wang
To follow up on my earlier email, I don't think there's need for a bridge
release given that we've successfully tested rolling upgrade from 2.x to
3.0.0. I expect we'll keep making improvements to smooth over any
additional incompatibilities found, but there isn't a requirement that a
user upgrade to a bridge release before upgrading to 3.0.

Otherwise, I don't have a strong opinion about when to discontinue branch-2
releases. Historically, a release line is maintained until interest in it
wanes. If the maintainers are taking care of the backports, it's not much
work for the rest of us to vote on the RCs.

Best,
Andrew

On Mon, Nov 13, 2017 at 4:19 PM, Wangda Tan  wrote:

> Thanks Vinod for staring this,
>
> I'm also leaning towards the plan (A):
>
>
>
>
> * (A)-- Make 2.9.x the last minor release off branch-2-- Have a
> maintenance release that bridges 2.9 to 3.x-- Continue to make more
> maintenance releases on 2.8 and 2.9 as necessary*
>
> The only part I'm not sure is having a separate bridge release other than
> 3.x.
>
> For the bridge release, Steve's suggestion sounds more doable:
>
> ** 3.1+ for new features*
> ** fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation*
> ** whoever puts their hand up to do 2.x releases deserves support in
> testing *
> ** If someone makes a really strong case to backport a feature from 3.x to
> branch-2 and its backwards compatible, I'm not going to stop them. It's
> just once 3.0 is out and a 3.1 on the way, it's less compelling*
>
> This makes community can focus on 3.x releases and fill whatever gaps of
> migrating from 2.x to 3.x.
>
> Best,
> Wangda
>
>
> On Wed, Nov 8, 2017 at 3:57 AM, Steve Loughran 
> wrote:
>
>>
>> > On 7 Nov 2017, at 19:08, Vinod Kumar Vavilapalli 
>> wrote:
>> >
>> >
>> >
>> >
>> >> Frankly speaking, working on some bridging release not targeting any
>> feature isn't so attractive to me as a contributor. Overall, the final
>> minor release off branch-2 is good, we should also give 3.x more time to
>> evolve and mature, therefore it looks to me we would have to work on two
>> release lines meanwhile for some time. I'd like option C), and suggest we
>> focus on the recent releases.
>> >
>> >
>> >
>> > Answering this question is also one of the goals of my starting this
>> thread. Collectively we need to conclude if we are okay or not okay with no
>> longer putting any new feature work in general on the 2.x line after 2.9.0
>> release and move over our focus into 3.0.
>> >
>> >
>> > Thanks
>> > +Vinod
>> >
>>
>>
>> As a developer of new features (e.g the Hadoop S3A committers), I'm
>> mostly already committed to targeting 3.1; the code in there to deal with
>> failures and retries has unashamedly embraced java 8 lambda-expressions in
>> production code: backporting that is going to be traumatic in terms of
>> IDE-assisted code changes and the resultant diff in source between branch-2
>> & trunk. What's worse, its going to be traumatic to test as all my JVMs
>> start with an 8 at the moment, and I'm starting to worry about whether I
>> should bump a windows VM up to Java 9 to keep an eye on Akira's work there.
>> Currently the only testing I'm really doing on java 7 is yetus branch-2 &
>> internal test runs.
>>
>>
>> 3.0 will be out the door, and we can assume that CDH will ship with it
>> soon (*)  which will allow for a rapid round trip time on inevitable bugs:
>> 3.1 can be the release with compatibility tuned, those reported issues
>> addressed. It's certainly where I'd like to focus.
>>
>>
>> At the same time: 2.7.2-2.8.x are the broadly used versions, we can't
>> just say "move to 3.0" & expect everyone to do it, not given we have
>> explicitly got backwards-incompatible changes in. I don't seen people
>> rushing to do it until the layers above are all qualified (HBase, Hive,
>> Spark, ...). Which means big users of 2.7/2,8 won't be in a rush to move
>> and we are going to have to maintain 2.x for a while, including security
>> patches for old versions. One issue there: what if a patch (such as bumping
>> up a JAR version) is incompatible?
>>
>> For me then
>>
>> * 3.1+ for new features
>> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
>> * whoever puts their hand up to do 2.x releases deserves support in
>> testing 
>> * If someone makes a really strong case to backport a feature from 3.x to
>> branch-2 and its backwards compatible, I'm not going to stop them. It's
>> just once 3.0 is out and a 3.1 on the way, it's less compelling
>>
>> -Steve
>>
>> Note: I'm implicitly assuming a timely 3.1 out the door with my work
>> included, all all issues arriving from 3,0 fixed. We can worry when 3.1
>> ships whether there's any benefit in maintaining a 3.0.x, or whether it's
>> best to say "move to 3.1"
>>
>>
>>
>> (*) just a guess based the effort & test reports of Andrew & others
>>
>>
>> 

[VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-14 Thread Andrew Wang
Hi folks,

Thanks as always to the many, many contributors who helped with this
release. I've created RC0 for Apache Hadoop 3.0.0. The artifacts are
available here:

http://people.apache.org/~wang/3.0.0-RC0/

This vote will run 5 days, ending on Nov 19th at 1:30pm Pacific.

3.0.0 GA contains 291 fixed JIRA issues since 3.0.0-beta1. Notable
additions include the merge of YARN resource types, API-based configuration
of the CapacityScheduler, and HDFS router-based federation.

I've done my traditional testing with a pseudo cluster and a Pi job. My +1
to start.

Best,
Andrew


Re: Heads up: branching branch-3.0.0 for GA

2017-11-14 Thread Andrew Wang
Branching is complete. Please use the 3.0.1 fix version for further commits
to branch-3.0. Ping me if you want something in branch-3.0.0 since I'm
rolling RC0 now.

On Tue, Nov 14, 2017 at 11:08 AM, Andrew Wang <andrew.w...@cloudera.com>
wrote:

> Hi folks,
>
> We've resolved all the blockers for 3.0.0 and the release notes and
> changelog look good, so I'm going to cut the branch and get started on the
> RC.
>
> * branch-3.0 will advance to 3.0.1-SNAPSHOT
> * branch-3.0.0 will go to 3.0.0
>
> Please keep this in mind when committing.
>
> Cheers,
> Andrew
>


Heads up: branching branch-3.0.0 for GA

2017-11-14 Thread Andrew Wang
Hi folks,

We've resolved all the blockers for 3.0.0 and the release notes and
changelog look good, so I'm going to cut the branch and get started on the
RC.

* branch-3.0 will advance to 3.0.1-SNAPSHOT
* branch-3.0.0 will go to 3.0.0

Please keep this in mind when committing.

Cheers,
Andrew
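(For the curious, the version-bump mechanics amount to roughly the
following; the exact commands are assumed, not quoted from the release
process:)

    git checkout branch-3.0
    git checkout -b branch-3.0.0
    mvn versions:set -DnewVersion=3.0.0             # release branch
    git checkout branch-3.0
    mvn versions:set -DnewVersion=3.0.1-SNAPSHOT    # development continues here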


[jira] [Created] (HADOOP-15037) Add site release notes for OrgQueue and resource types

2017-11-13 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-15037:


 Summary: Add site release notes for OrgQueue and resource types
 Key: HADOOP-15037
 URL: https://issues.apache.org/jira/browse/HADOOP-15037
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Andrew Wang
Assignee: Andrew Wang


Let's add some small blurbs and doc links to the site release notes for these 
features.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] A final minor release off branch-2?

2017-11-06 Thread Andrew Wang
What are the known gaps that need bridging between 2.x and 3.x?

From an HDFS perspective, we've tested wire compat, rolling upgrade, and
rollback.

From a YARN perspective, we've tested wire compat and rolling upgrade. Arun
just mentioned an NM rollback issue that I'm not familiar with.

Anything else? External to this discussion, these should be documented as
known issues for 3.0.

Best,
Andrew

On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh  wrote:

> Thanks for starting this discussion VInod.
>
> I agree (C) is a bad idea.
> I would prefer (A) given that ATM, branch-2 is still very close to
> branch-2.9 - and it is a good time to make a collective decision to lock
> down commits to branch-2.
>
> I think we should also clearly define what the 'bridging' release should
> be.
> I assume it means the following:
> * Any 2.x user wanting to move to 3.x must first upgrade to the bridging
> release first and then upgrade to the 3.x release.
> * With regard to state store upgrades (at least NM state stores) the
> bridging state stores should be aware of all new 3.x keys so the implicit
> assumption would be that a user can only rollback from the 3.x release to
> the bridging release and not to the old 2.x release.
> * Use the opportunity to clean up deprecated API ?
> * Do we even want to consider a separate bridging release for the 2.7,
> 2.8 and 2.9 lines?
>
> Cheers
> -Arun
>
> On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli <
> vino...@apache.org>
> wrote:
>
> > Hi all,
> >
> > With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out
> > (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we
> have
> > a discussion on how we manage our developmental bandwidth between 2.x
> line
> > and 3.x lines.
> >
> > Once 3.0 GA goes out, we will have two parallel and major release lines.
> > The last time we were in this situation was back when we did 1.x -> 2.x
> > jump.
> >
> > Parallel releases imply an overhead of decisions, branch-merges and
> > back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1,
> > 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of
> > these lines - e.g. 2.8, 2.9 - are going to be used for a while at a
> > bunch of large sites! At the same time, our users won't migrate to 3.0 GA
> > overnight - so we do have to support two parallel lines.
> >
> > I propose we start thinking of the fate of branch-2. The idea is to have
> > one final release that helps our users migrate from 2.x to 3.x. This
> > includes any changes on the older line to bridge compatibility issues,
> > upgrade issues, layout changes, tooling etc.
> >
> > We have a few options I think
> >  (A)
> > -- Make 2.9.x the last minor release off branch-2
> > -- Have a maintenance release that bridges 2.9 to 3.x
> > -- Continue to make more maintenance releases on 2.8 and 2.9 as
> > necessary
> > -- All new features obviously only go into the 3.x line as no
> features
> > can go into the maint line.
> >
> >  (B)
> > -- Create a new 2.10 release which doesn't have any new features, but
> > as a bridging release
> > -- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as
> > necessary
> > -- All new features, other than the bridging changes, go into the 3.x
> > line
> >
> >  (C)
> > -- Continue making branch-2 releases and postpone this discussion for
> > later
> >
> > I'm leaning towards (A) or to a lesser extent (B). Willing to hear
> > otherwise.
> >
> > Now, this obviously doesn't mean blocking of any more minor releases on
> > branch-2. Obviously, any interested committer / PMC can roll up his/her
> > sleeves, create a release plan and release, but we all need to
> acknowledge
> > that versions are not cheap and figure out how the community bandwidth is
> > split overall.
> >
> > Thanks
> > +Vinod
> > PS: The proposal is obviously not to force everyone to go in one
> direction
> > but more of a nudging the community to figure out if we can focus a major
> > part of of our bandwidth on one line. I had a similar concern when we
> were
> > doing 2.8 and 3.0 in parallel, but the impending possibility of spreading
> > too thin is much worse IMO.
> > PPS: (C) is a bad choice. With 2.8 and 2.9 we are already seeing user
> > adoption splintering between two lines. With 2.10, 2.11 etc coexisting
> with
> > 3.0, 3.1 etc, we will revisit the mad phase years ago when we had 0.20.x,
> > 0.20-security coexisting with 0.21, 0.22 etc.
>


[jira] [Created] (HADOOP-15018) Update JAVA_HOME in create-release for Xenial Dockerfile

2017-11-06 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-15018:


 Summary: Update JAVA_HOME in create-release for Xenial Dockerfile
 Key: HADOOP-15018
 URL: https://issues.apache.org/jira/browse/HADOOP-15018
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Blocker


create-release expects the Oracle JDK when setting JAVA_HOME. HADOOP-14816 no 
longer includes the Oracle JDK, so we need to update this to point to OpenJDK 
instead.
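
A sketch of the fix described (the OpenJDK path is the stock Ubuntu Xenial
location, assumed here rather than taken from the patch):

    # before: JAVA_HOME pointed at the now-removed Oracle JDK
    # export JAVA_HOME=/usr/lib/jvm/java-8-oracle
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64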



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



2017-10-31 Hadoop 3 release status update

2017-10-31 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-10-31

Lots of progress towards GA, we look on track for cutting RC0 this week. I
ran the versions script to check the branch matches up with JIRA and fixed
things up, and also checked that the changelog and release notes look
reasonable.

Highlights:

   - Resource types vote has passed and will be merged with branch-3.0
   shortly.
   - Down to three blockers on the dashboard, all being actively revved.

Red flags:

   - Still need to validate that resource types is ready to go once it's
   merged.

Previous tracked GA blockers that have been resolved or dropped:

   - Change of ExecutionType
      - YARN-7178 - Add documentation for Container Update API RESOLVED: Arun
        got the patch in with reviews from Wangda and Haibo.
   - ReservationSystem
      - YARN-4827 - Document configuration of ReservationSystem for
        FairScheduler RESOLVED: Yufei and Subru got this in.
   - Rolling upgrade
      - YARN-6142 - Support rolling upgrade between 2.x and 3.x RESOLVED: Ray
        resolved this since we think it's sufficiently complete.
   - Erasure coding
      - HDFS-12686 - Erasure coding system policy state is not correctly
        saved and loaded during real cluster restart RESOLVED: Resolved this
        one to incorporate it in HDFS-12682

GA blockers:

   - Rolling upgrade
      - HDFS-11096 - Support rolling upgrade between 2.x and 3.x PATCH
        AVAILABLE: I asked Sean if we can downgrade this from blocker
   - Erasure coding
      - HDFS-12682 - ECAdmin -listPolicies will always show
        SystemErasureCodingPolicies state as DISABLED PATCH AVAILABLE:
        Actively being worked on and reviewed, should be in soon
      - HDFS-11467 - Support ErasureCoding section in OIV XML/ReverseXML
        PATCH AVAILABLE: Waiting on HDFS-12682, I asked if we can work
        concurrently

Features merged for GA:

   - Erasure coding
      - Testing is still ongoing at Cloudera, no new bugs found recently
      - Closing on remaining blockers for GA
   - Classpath isolation (HADOOP-11656)
      - HADOOP-13916 - Document how downstream clients should make use of the
        new shaded client artifacts OPEN: Seems unlikely to make it
   - Compat guide (HADOOP-13714)
      - HADOOP-14876 - Create downstream developer docs from the
        compatibility guidelines PATCH AVAILABLE: Patch is being actively
        revved and reviewed, Robert +1'd, Anu posted a big review
      - HADOOP-14875 - Create end user documentation from the compatibility
        guidelines PATCH AVAILABLE: No patch yet
   - TSv2 alpha 2
      - This was merged, no problems thus far :)
   - API-based scheduler configuration YARN-5734 - OrgQueue for easy
     CapacityScheduler queue configuration management RESOLVED
      - Merged, no problems thus far :)
   - HDFS router-based federation HDFS-10467 - Router-based HDFS federation
     RESOLVED
      - Merged, no problems thus far :)
   - Resource types YARN-3926 - Extend the YARN resource model for easier
     resource-type management and profiles RESOLVED
      - Vote has passed, Daniel is currently doing the mechanics of merging
      - Need to also perform final validation post-merge

Dropping the "unmerged features" section since we're not letting in
anything else at this point.


[jira] [Resolved] (HADOOP-14555) document how to run wasb tests in azure docs site/testing.md

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-14555.
--
   Resolution: Duplicate
Fix Version/s: (was: 3.0.0)

> document how to run wasb tests in azure docs site/testing.md
> 
>
> Key: HADOOP-14555
> URL: https://issues.apache.org/jira/browse/HADOOP-14555
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 2.8.1
>Reporter: Steve Loughran
>
> There's no single (current) documentation on running the azure tests
> * There's some in site/index.md, but it looks potentially out of date 
> (refers to an older azure SDK version)
> * There's a file 
> {{src/test/org/apache/hadoop/fs/azure/RunningLiveWasbTests.txt}}  which 
> refers to a nonexistent doc {{hadoop-tools/hadoop-azure/README.txt }} for 
> instructions.
> Proposed: 
> # move testing docs out of main azure doc page, with link from there. 
> # bring up to date with SDK, move of tests to ITests.
> # purge all other references, including bits of test javadocs which are no 
> longer correct.
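
For reference, running the live wasb suite today looks roughly like the following; the exact steps are part of what these docs should pin down, and live runs assume credentials in src/test/resources/azure-auth-keys.xml:

$ cd hadoop-tools/hadoop-azure
$ mvn verify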



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14555) document how to run wasb tests in azure docs site/testing.md

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-14555:
--

> document how to run wasb tests in azure docs site/testing.md
> 
>
> Key: HADOOP-14555
> URL: https://issues.apache.org/jira/browse/HADOOP-14555
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 2.8.1
>Reporter: Steve Loughran
>
> There's no single (current) documentation on running the azure tests
> * There's some in site/index.md, but it looks potentially out of date 
> (refers to an older azure SDK version)
> * There's a file 
> {{src/test/org/apache/hadoop/fs/azure/RunningLiveWasbTests.txt}}  which 
> refers to a nonexistent doc {{hadoop-tools/hadoop-azure/README.txt }} for 
> instructions.
> Proposed: 
> # move testing docs out of main azure doc page, with link from there. 
> # bring up to date with SDK, move of tests to ITests.
> # purge all other references, including bits of test javadocs which are no 
> longer correct.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2017-10-24 Thread Andrew Wang
FWIW we've been running branch-3.0 unit tests successfully internally,
though we have separate jobs for Common, HDFS, YARN, and MR. The failures
here are probably a property of running everything in the same JVM, which
I've found problematic in the past due to OOMs.
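
For anyone reproducing locally, running the modules as separate invocations (mirroring our internal jobs) keeps the per-JVM footprint down; roughly (module list is illustrative, not exhaustive):

$ mvn test -pl hadoop-common-project/hadoop-common
$ mvn test -pl hadoop-hdfs-project/hadoop-hdfs
$ mvn test -pl hadoop-yarn-project
$ mvn test -pl hadoop-mapreduce-project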

On Tue, Oct 24, 2017 at 4:04 PM, Allen Wittenauer 
wrote:

>
> My plan is currently to:
>
> *  switch some of Hadoop’s Yetus jobs over to my branch with the YETUS-561
> patch to test it out.
> * if the tests work, work on getting YETUS-561 committed to yetus master
> * switch jobs back to ASF yetus master either post-YETUS-561 or without it
> if it doesn’t work
> * go back to working on something else, regardless of the outcome
>
>
> > On Oct 24, 2017, at 2:55 PM, Chris Douglas  wrote:
> >
> > Sean/Junping-
> >
> > Ignoring the epistemology, it's a problem. Let's figure out what's
> > causing memory to balloon and then we can work out the appropriate
> > remedy.
> >
> > Is this reproducible outside the CI environment? To Junping's point,
> > would YETUS-561 provide more detailed information to aid debugging? -C
> >
> > On Tue, Oct 24, 2017 at 2:50 PM, Junping Du  wrote:
> >> In general, the "solid evidence" of memory leak comes from analysis of
> heapdump, jastack, gc log, etc. In many cases, we can locate/conclude which
> piece of code are leaking memory from the analysis.
> >>
> >> Unfortunately, I cannot find any conclusion from previous comments and
> it even cannot tell which daemons/components of HDFS consumes unexpected
> high memory. Don't sounds like a solid bug report to me.
> >>
> >>
> >>
> >> Thanks,
> >>
> >>
> >> Junping
> >>
> >>
> >> 
> >> From: Sean Busbey 
> >> Sent: Tuesday, October 24, 2017 2:20 PM
> >> To: Junping Du
> >> Cc: Allen Wittenauer; Hadoop Common; Hdfs-dev;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> >> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
> >>
> >> Just curious, Junping what would "solid evidence" look like? Is the
> supposition here that the memory leak is within HDFS test code rather than
> library runtime code? How would such a distinction be shown?
> >>
> >> On Tue, Oct 24, 2017 at 4:06 PM, Junping Du  > wrote:
> >> Allen,
> >> Do we have any solid evidence to show the HDFS unit tests going
> through the roof are due to serious memory leak by HDFS? Normally, I don't
> expect memory leak are identified in our UTs - mostly, it (test jvm gone)
> is just because of test or deployment issues.
> >> Unless there is concrete evidence, my concern on seriously memory
> leak for HDFS on 2.8 is relatively low given some companies (Yahoo,
> Alibaba, etc.) have deployed 2.8 on large production environment for
> months. Non-serious memory leak (like forgetting to close stream in
> non-critical path, etc.) and other non-critical bugs always happens here
> and there that we have to live with.
> >>
> >> Thanks,
> >>
> >> Junping
> >>
> >> 
> >> From: Allen Wittenauer >
> >> Sent: Tuesday, October 24, 2017 8:27 AM
> >> To: Hadoop Common
> >> Cc: Hdfs-dev; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> >> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
> >>
> >>> On Oct 23, 2017, at 12:50 PM, Allen Wittenauer <
> a...@effectivemachines.com> wrote:
> >>>
> >>>
> >>>
> >>> With no other information or access to go on, my current hunch is that
> one of the HDFS unit tests is ballooning in memory size.  The easiest way
> to kill a Linux machine is to eat all of the RAM, thanks to overcommit and
> that's what this "feels" like.
> >>>
> >>> Someone should verify if 2.8.2 has the same issues before a release
> goes out ...
> >>
> >>
> >>FWIW, I ran 2.8.2 last night and it has the same problems.
> >>
> >>Also: the node didn't die!  Looking through the workspace (so
> the next run will destroy them), two sets of logs stand out:
> >>
> >> https://builds.apache.org/job/hadoop-qbt-branch2-java7-
> linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> >>
> >>and
> >>
> >> https://builds.apache.org/job/hadoop-qbt-branch2-java7-
> linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/
> >>
> >>It looks like my hunch is correct:  RAM in the HDFS unit tests
> are going through the roof.  It's also interesting how MANY log files there
> are.  Is surefire not picking up that jobs are dying?  Maybe not if memory
> is getting tight.
> >>
> >>Anyway, at the point, branch-2.8 and higher are probably
> fubar'd. Additionally, I've filed YETUS-561 so that Yetus-controlled Docker
> containers can 

2017-10-20 Hadoop 3 release status update

2017-10-20 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-10-20

Apologies for skipping the update last week. Here's how we're tracking for
GA.

Highlights:

   - Merge of HDFS router-based federation and API-based scheduler
   configuration with no reported problems. Kudos to the contributors involved!

Red flags:

   - We're making a last-minute push to get resource types (but not resource
     profiles) in. Coming this late, it's a risk, but we decided it's
     worthwhile for this feature. See Daniel's yarn-dev email for the full
     rationale.
   - Still uncovering EC bugs from testing

Previously tracked GA blockers that have been resolved or dropped:

   - YARN-6623 - Add support to turn off launching privileged containers in
     the container-executor RESOLVED: Committed and resolved
   - Change of ExecutionType
      - YARN-7275 - NM Statestore cleanup for Container updates RESOLVED:
        Patch committed, resolved.
   - ReservationSystem
      - YARN-4859 - [Bug] Unable to submit a job to a reservation when using
        FairScheduler RESOLVED: Yufei tested this and found things mostly
        worked, filed two non-blocker follow-ons: YARN-7347 - Fix the bug in
        Fair scheduler to handle a queue named "root.root" OPEN, and
        YARN-7348 - Ignore the vcore in reservation request for fair policy
        queue OPEN

GA blockers:

   - Change of ExecutionType
      - YARN-7178 - Add documentation for Container Update API OPEN: Still no
        update from Arun, I pinged it.
   - ReservationSystem
      - YARN-4827 - Document configuration of ReservationSystem for
        FairScheduler OPEN: Yufei said he'd work on it as of 2 days ago
   - Rolling upgrade
      - YARN-6142 - Support rolling upgrade between 2.x and 3.x OPEN: I
        pinged this and asked for a status update
      - HDFS-11096 - Support rolling upgrade between 2.x and 3.x PATCH
        AVAILABLE: I pinged this and asked for a status update
   - Erasure coding
      - HDFS-12682 - ECAdmin -listPolicies will always show policy state as
        DISABLED OPEN: New blocker filed this week, Xiao is working on it
      - HDFS-12686 - Erasure coding system policy state is not correctly
        saved and loaded during real cluster restart OPEN: New blocker filed
        this week, Sammi is on it
      - HDFS-12686 - Erasure coding system policy state is not correctly
        saved and loaded during real cluster restart OPEN: Old blocker,
        Huafeng is on it, waiting on review from Wei-Chiu or Sammi

Features merged for GA:

   - Erasure coding
      - Continued bug reporting and fixing based on testing at Cloudera.
      - Two new blockers filed this week, mentioned above.
      - Huafeng completed patch to re-enable disabled EC tests
   - Classpath isolation (HADOOP-11656)
      - HADOOP-13916 - Document how downstream clients should make use of the
        new shaded client artifacts IN PROGRESS: I pinged it
   - Compat guide (HADOOP-13714)
      - HADOOP-14876 - Create downstream developer docs from the
        compatibility guidelines PATCH AVAILABLE: Daniel has a patch up,
        revved based on Steve's review feedback, waiting on Steve's reply
      - HADOOP-14875 - Create end user documentation from the compatibility
        guidelines OPEN: No patch yet
   - TSv2 alpha 2
      - This was merged, no problems thus far :)
   - API-based scheduler configuration YARN-5734 - OrgQueue for easy
     CapacityScheduler queue configuration management RESOLVED
      - Merged, no problems thus far :)
   - HDFS router-based federation HDFS-10467

Re: 2017-10-06 Hadoop 3 release status update

2017-10-06 Thread Andrew Wang
Thanks for the update Allen, appreciate your continued help reviewing this
feature.

Looking at the calendar, we're three weeks from when we want to have the GA
RC0 out for vote. We're already dipping into code-freeze time by landing HDFS
router-based federation and API-based scheduler configuration next week. If

So, my current thinking is that we should draw a line after these pending
branches merge. Like before, I'm willing to bend on this if there are
strong arguments, but the quality bar is even higher than it was for beta1,
and we've still got plenty of other blockers/criticals to work on for GA.

If you feel differently, please reach out, I can make myself very available
next week for a call.

Best,
Andrew

On Fri, Oct 6, 2017 at 3:12 PM, Allen Wittenauer <a...@effectivemachines.com>
wrote:

>
> > On Oct 6, 2017, at 1:31 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> >
> >   - Still waiting on Allen to review YARN native services feature.
>
> Fake news.
>
> I’m still -1 on it, at least prior to a patch that posted late
> yesterday. I’ll probably have a chance to play with it early next week.
>
>
> Key problems:
>
> * still haven’t been able to bring up dns daemon due to lacking
> documentation
>
> * it really needs better naming and command structures.  When put
> into the larger YARN context, it’s very problematic:
>
> $ yarn --daemon start resourcemanager
>
> vs.
>
> $ yarn --daemon start apiserver
>
> if you awoke from a deep sleep inside a cave, which
> one would you expect to “start YARN”? It's made worse by the feature being
> called “YARN services” all over the place.
>
> $ yarn service foo
>
> … what does this even mean?
>
> It would be great if other outsiders really looked hard at this
> branch to give the team feedback.   Once it gets released, it’s gonna be
> too late to change it….
>
> As a sidenote:
>
> It’d be great if the folks working on YARN spent some time
> consolidating daemons.  With this branch, it now feels like we’re
> approaching the double digit area of daemons to turn on all the features.
> It’s well past ridiculous, especially considering we still haven’t replaced
> the MRJHS’s feature set to the point we can turn it off.
>
>


2017-10-06 Hadoop 3 release status update

2017-10-06 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-10-06

The beta1 RC0 vote passed, and beta1 is out! Now tracking GA features.

Highlights:

   - 3.0.0-beta1 has been released!
   - Router-based federation merge vote should be about to pass
   - API-based scheduler configuration merge vote is out, has the votes so
   far

Red flags:

   - Still need to nail down whether we're going to try and merge resource
   profiles. I've been emailing with Wangda and Daniel about this, we need to
   reach a decision ASAP (might already be too late).
   - Still waiting on Allen to review YARN native services feature.

Previously tracked GA blockers that have been resolved or dropped:

   - YARN-7134 - AppSchedulingInfo has a dependency on capacity scheduler
     OPEN: Wangda downgraded this to "Major", dropping from list.

GA blockers:

   - YARN-6623 - Add support to turn off launching privileged containers in
     the container-executor PATCH AVAILABLE: Actively being reviewed
   - Change of ExecutionType
      - YARN-7275 - NM Statestore cleanup for Container updates PATCH
        AVAILABLE: Kartheek has posted a patch, waiting for review
      - YARN-7178 - Add documentation for Container Update API OPEN: No
        update from Arun, though it's just a docs patch
   - ReservationSystem
      - YARN-4859 - [Bug] Unable to submit a job to a reservation when using
        FairScheduler OPEN: Yufei has picked this up
      - YARN-4827 - Document configuration of ReservationSystem for
        FairScheduler OPEN: Yufei has picked this up, just a docs patch
   - Rolling upgrade
      - YARN-6142 - Support rolling upgrade between 2.x and 3.x OPEN: Ray is
        still going through JACC and proto output
      - HDFS-11096 - Support rolling upgrade between 2.x and 3.x PATCH
        AVAILABLE: Sean has revved the patch and is waiting on reviews from
        Ray, Allen

Features merged for GA:

   - Erasure coding
  - Continued bug reporting and fixing based on testing at Cloudera.
  - Still need to finish the 3.0 must-do's
   - Classpath isolation (HADOOP-11656)
   - HADOOP-14771 is still floating, along with adding documentation.
   - Compat guide (HADOOP-13714)
      - Synced with Daniel, he plans to wrap up the remaining stuff next week
   - TSv2 alpha 2
      - This was merged, no problems thus far :)

Unmerged features:

   - Resource types / profiles (YARN-3926 and YARN-7069) (Wangda Tan)
      - This has been merged for 3.1.0, YARN-7069 tracks follow-on work
      - Wangda said that he's okay waiting for 3.1.0 for this, we're waiting
        on Daniel. I synced with Daniel earlier this week, and he wants to
        try and get some of it into 3.0.0. Waiting on an update.
      - I still need a JIRA query for tracking the state of this.
   - HDFS router-based federation (HDFS-10467) (Inigo Goiri and Chris Douglas)
      - Merge vote should close any minute now
   - API-based scheduler configuration (Jonathan Hung)
      - Merge vote is out, will close next week
   - YARN native services (YARN-5079) (Jian He)
      - Subtasks were filed to address Allen's review comments from the
        previous merge vote, only one pending
      - We need to confirm with Allen that this is ready to go, he hasn't
        been reviewing


Re: [VOTE] Release Apache Hadoop 3.0.0-beta1 RC0

2017-10-04 Thread Andrew Wang
Thanks for the additional review Rohith, much appreciated!

On Wed, Oct 4, 2017 at 12:14 AM, Rohith Sharma K S <
rohithsharm...@apache.org> wrote:

> +1 (binding)
>
> Built from source and deployed YARN HA cluster with ATSv2 enabled in
> non-secured cluster.
> - tested for RM HA/work-preservring-restart/ NM-work-preserving restart
> for ATSv2 entities.
> - verified all ATSv2 REST end points to retrieve the entities
> - ran sample MR jobs and distributed jobs
>
> Thanks & Regards
> Rohith Sharma K S
>
> On 4 October 2017 at 05:31, Andrew Wang <andrew.w...@cloudera.com> wrote:
>
>> Thanks everyone for voting! With 4 binding +1s and 7 non-binding +1s, the
>> vote passes.
>>
>> I'll get started on pushing out the release.
>>
>> Best,
>> Andrew
>>
>> On Tue, Oct 3, 2017 at 3:45 PM, Aaron Fabbri <fab...@cloudera.com> wrote:
>>
>> > +1
>> >
>> > Built from source.  Ran S3A integration tests in us-west-2 with S3Guard
>> > (both Local and Dynamo metadatastore).
>> >
>> > Everything worked fine except I hit one integration test failure.  It
>> is a
>> > minor test issue IMO and I've filed HADOOP-14927
>> >
>> > Failed tests:
>> >   ITestS3GuardToolDynamoDB>AbstractS3GuardToolTestBase.testDestroyNoBucket:228
>> > Expected an exception, got 0
>> >   ITestS3GuardToolLocal>AbstractS3GuardToolTestBase.testDestroyNoBucket:228
>> > Expected an exception, got 0
>> >
>> >
>> >
>> > On Tue, Oct 3, 2017 at 2:45 PM, Ajay Kumar <ajay.ku...@hortonworks.com>
>> > wrote:
>> >
>> >> +1 (non-binding)
>> >>
>> >> - built from source
>> >> - deployed on single node cluster
>> >> - Basic hdfs operations
>> >> - Run wordcount on a text file
>> >> Thanks,
>> >> Ajay
>> >>
>> >>
>> >> On 10/3/17, 1:04 PM, "Eric Badger" <ebad...@oath.com.INVALID> wrote:
>> >>
>> >> +1 (non-binding)
>> >>
>> >> - Verified all checksums and signatures
>> >> - Built native from source on macOS 10.12.6 and RHEL 7.1
>> >> - Deployed a single node pseudo cluster
>> >> - Ran pi and sleep jobs
>> >> - Verified Docker was marked as experimental
>> >>
>> >> Thanks,
>> >>
>> >> Eric
>> >>
>> >> On Tue, Oct 3, 2017 at 1:41 PM, John Zhuge <john.zh...@gmail.com>
>> >> wrote:
>> >>
>> >> > +1 (binding)
>> >> >
>> >> >- Verified checksums and signatures of all tarballs
>> >> >- Built source with native, Java 1.8.0_131-b11 on Mac OS X
>> >> 10.12.6
>> >> >- Verified cloud connectors:
>> >> >   - All S3A integration tests
>> >> >   - All ADL live unit tests
>> >> >- Deployed both binary and built source to a pseudo cluster,
>> >> passed the
>> >> >following sanity tests in insecure, SSL, and SSL+Kerberos
>> mode:
>> >> >   - HDFS basic and ACL
>> >> >   - DistCp basic
>> >> >   - MapReduce wordcount (only failed in SSL+Kerberos mode for
>> >> binary
>> >> >   tarball, probably unrelated)
>> >> >   - KMS and HttpFS basic
>> >> >   - Balancer start/stop
>> >> >
>> >> > Hit the following errors but they don't seem to be blocking:
>> >> >
>> >> > == Missing dependencies during build ==
>> >> >
>> >> > > ERROR: hadoop-aliyun has missing dependencies:
>> json-lib-jdk15.jar
>> >> > > ERROR: hadoop-azure has missing dependencies:
>> >> jetty-util-ajax-9.3.19.
>> >> > > v20170502.jar
>> >> > > ERROR: hadoop-azure-datalake has missing dependencies:
>> >> okhttp-2.4.0.jar
>> >> > > ERROR: hadoop-azure-datalake has missing dependencies:
>> >> okio-1.4.0.jar
>> >> >
>> >> >
>> >> > Filed HADOOP-14923, HADOOP-14924, and HADOOP-14925.
>> >> >
>> >> > == Unit tests failed in Kerberos+SSL mode for KMS and HttpFs
>> >> default HTTP
>> >> > servlet /c

[ANNOUNCE] Apache Hadoop 3.0.0-beta1 has been released

2017-10-04 Thread Andrew Wang
Hi all,

I'm pleased to announce the release of Apache Hadoop 3.0.0-beta1. This is
our first beta release in the 3.0.0 release line, and is planned to be the
last release before 3.0.0 GA. Beta releases are API stable but come with no
guarantee of quality, and are not intended for production use.

3.0.0-beta1 comprises 576 bug fixes, improvements, and other enhancements
since 3.0.0-alpha4. The full changelog [1] and release notes [2] are
available on the website, along with a higher-level description of major
changes in 3.0.0 [3].

Major features since alpha4 include the addition of S3Guard [4], which adds
strong consistency and performance improvements to the S3A filesystem
connector, and YARN Timeline Service v2 alpha2 [5], which further improves
on TSv2 alpha1 included in earlier 3.0.0 releases.

Since 3.0.0 GA is planned for next month, users are highly encouraged to
test this beta release. Let us know on the lists if we can help with your
testing efforts.

Thanks as always to the many contributors who helped with this release!

Cheers,
Andrew

[1]:
http://hadoop.apache.org/docs/r3.0.0-beta1/hadoop-project-dist/hadoop-common/release/3.0.0-beta1/CHANGES.3.0.0-beta1.html
[2]:
http://hadoop.apache.org/docs/r3.0.0-beta1/hadoop-project-dist/hadoop-common/release/3.0.0-beta1/RELEASENOTES.3.0.0-beta1.html
[3]: http://hadoop.apache.org/docs/r3.0.0-beta1/index.html
[4]:
http://hadoop.apache.org/docs/r3.0.0-beta1/hadoop-aws/tools/hadoop-aws/s3guard.html
[5]:
http://hadoop.apache.org/docs/r3.0.0-beta1/hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html


Re: [VOTE] Release Apache Hadoop 3.0.0-beta1 RC0

2017-10-03 Thread Andrew Wang
Thanks everyone for voting! With 4 binding +1s and 7 non-binding +1s, the
vote passes.

I'll get started on pushing out the release.

Best,
Andrew

On Tue, Oct 3, 2017 at 3:45 PM, Aaron Fabbri <fab...@cloudera.com> wrote:

> +1
>
> Built from source.  Ran S3A integration tests in us-west-2 with S3Guard
> (both Local and Dynamo metadatastore).
>
> Everything worked fine except I hit one integration test failure.  It is a
> minor test issue IMO and I've filed HADOOP-14927
>
> Failed tests:
>   ITestS3GuardToolDynamoDB>AbstractS3GuardToolTestBase.testDestroyNoBucket:228
> Expected an exception, got 0
>   ITestS3GuardToolLocal>AbstractS3GuardToolTestBase.testDestroyNoBucket:228
> Expected an exception, got 0
>
>
>
> On Tue, Oct 3, 2017 at 2:45 PM, Ajay Kumar <ajay.ku...@hortonworks.com>
> wrote:
>
>> +1 (non-binding)
>>
>> - built from source
>> - deployed on single node cluster
>> - Basic hdfs operations
>> - Run wordcount on a text file
>> Thanks,
>> Ajay
>>
>>
>> On 10/3/17, 1:04 PM, "Eric Badger" <ebad...@oath.com.INVALID> wrote:
>>
>> +1 (non-binding)
>>
>> - Verified all checksums and signatures
>> - Built native from source on macOS 10.12.6 and RHEL 7.1
>> - Deployed a single node pseudo cluster
>> - Ran pi and sleep jobs
>> - Verified Docker was marked as experimental
>>
>> Thanks,
>>
>> Eric
>>
>> On Tue, Oct 3, 2017 at 1:41 PM, John Zhuge <john.zh...@gmail.com>
>> wrote:
>>
>> > +1 (binding)
>> >
>> >- Verified checksums and signatures of all tarballs
>> >- Built source with native, Java 1.8.0_131-b11 on Mac OS X
>> 10.12.6
>> >- Verified cloud connectors:
>> >   - All S3A integration tests
>> >   - All ADL live unit tests
>> >- Deployed both binary and built source to a pseudo cluster,
>> passed the
>> >following sanity tests in insecure, SSL, and SSL+Kerberos mode:
>> >   - HDFS basic and ACL
>> >   - DistCp basic
>> >   - MapReduce wordcount (only failed in SSL+Kerberos mode for
>> binary
>> >   tarball, probably unrelated)
>> >   - KMS and HttpFS basic
>> >   - Balancer start/stop
>> >
>> > Hit the following errors but they don't seem to be blocking:
>> >
>> > == Missing dependencies during build ==
>> >
>> > > ERROR: hadoop-aliyun has missing dependencies: json-lib-jdk15.jar
>> > > ERROR: hadoop-azure has missing dependencies:
>> jetty-util-ajax-9.3.19.
>> > > v20170502.jar
>> > > ERROR: hadoop-azure-datalake has missing dependencies:
>> okhttp-2.4.0.jar
>> > > ERROR: hadoop-azure-datalake has missing dependencies:
>> okio-1.4.0.jar
>> >
>> >
>> > Filed HADOOP-14923, HADOOP-14924, and HADOOP-14925.
>> >
>> > == Unit tests failed in Kerberos+SSL mode for KMS and HttpFs
>> default HTTP
>> > servlet /conf, /stacks, and /logLevel ==
>> >
>> > One example below:
>> >
>> > >Connecting to
>> > > https://localhost:14000/logLevel?log=org.apache.hadoop.fs.
>> http.server.
>> > HttpFSServer
>> > >Exception in thread "main"
>> > > org.apache.hadoop.security.authentication.client.
>> > AuthenticationException:
>> > > Authentication failed, URL:
>> > > https://localhost:14000/logLevel?log=org.apache.hadoop.fs.
>> http.server.
>> > HttpFSServer=jzhuge,
>> > > status: 403, message: GSSException: Failure unspecified at
>> GSS-API level
>> > > (Mechanism level: Request is a replay (34))
>> >
>> >
>> > The /logLevel failure will affect command "hadoop daemonlog".
>> >
>> >
>> > On Tue, Oct 3, 2017 at 10:56 AM, Andrew Wang <
>> andrew.w...@cloudera.com>
>> > wrote:
>> >
>> > > Thanks for all the votes thus far! We've gotten the binding +1's
>> to close
>> > > the release, though are there contributors who could kick the
>> tires on
>> > > S3Guard and YARN TSv2 alpha2? These are the two new features
>> merged since
>> > > alpha4, so it'd be good to get some coverage.
>>  

Re: [VOTE] Release Apache Hadoop 3.0.0-beta1 RC0

2017-10-03 Thread Andrew Wang
Thanks for all the votes thus far! We've gotten the binding +1's to close
the release, though are there contributors who could kick the tires on
S3Guard and YARN TSv2 alpha2? These are the two new features merged since
alpha4, so it'd be good to get some coverage.
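
For S3Guard, the hadoop-aws integration suite is the quickest way to kick the tires; roughly the following, assuming a test bucket and credentials already configured in auth-keys.xml per the hadoop-aws testing docs:

$ cd hadoop-tools/hadoop-aws
$ mvn verify -Ds3guard -Ddynamo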



On Tue, Oct 3, 2017 at 9:45 AM, Brahma Reddy Battula <bra...@apache.org>
wrote:

>
> Thanks Andrew.
>
> +1 (non binding)
>
> --Built from source
> --installed 3 node HA cluster
> --Verified shell commands and UI
> --Ran wordcount/pic jobs
>
>
>
>
> On Fri, 29 Sep 2017 at 5:34 AM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Hi all,
>>
>> Let me start, as always, by thanking the many, many contributors who
>> helped
>> with this release! I've prepared an RC0 for 3.0.0-beta1:
>>
>> http://home.apache.org/~wang/3.0.0-beta1-RC0/
>>
>> This vote will run five days, ending on Oct 3rd at 5PM Pacific.
>>
>> beta1 contains 576 fixed JIRA issues comprising a number of bug fixes,
>> improvements, and feature enhancements. Notable additions include the
>> addition of YARN Timeline Service v2 alpha2, S3Guard, completion of the
>> shaded client, and HDFS erasure coding pluggable policy support.
>>
>> I've done the traditional testing of running a Pi job on a pseudo cluster.
>> My +1 to start.
>>
>> We're working internally on getting this run through our integration test
>> rig. I'm hoping Vijay or Ray can ring in with a +1 once that's complete.
>>
>> Best,
>> Andrew
>>
> --
>
>
>
> --Brahma Reddy Battula
>


2017-09-20 Hadoop 3 release status update

2017-09-29 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-29

After about a month of slip, RC0 has been sent out for a VOTE. Focus now
turns to GA, where we will attempt to keep the original beta1 target date
(early November).

Highlights:

   - RC0 vote was sent out on Thursday, two binding +1's so far.

Red flags:

   - Resource profiles still has a number of pending subtasks, which is
   concerning from a schedule perspective. I emailed Wangda about this, and we
   need to discuss with other key contributors.
   - Native services has one pending subtask but we haven't gotten
   follow-on reviews from Allen (who -1'd the earlier merge vote). Need to
   confirm that we've satisfied his feedback.

Previously tracked beta1 blockers that have been resolved or dropped:

   - YARN-6623 was pushed out of beta1 to GA, has been committed so we can
   drop it from tracking.
   - HADOOP-14897 (Loosen compatibility guidelines for native dependencies):
     Patch committed!

beta1 blockers:

   - None, RC0 is out

GA blockers:

   - YARN-7134 - AppSchedulingInfo has a dependency on capacity scheduler
     OPEN: this one popped out of nowhere, I don't have an update yet.
   - YARN-7178 - Add documentation for Container Update API OPEN: this also
     popped out of nowhere, no update yet.
   - YARN-7275 - NM Statestore cleanup for Container updates OPEN: Ditto
   - YARN-4859 - [Bug] Unable to submit a job to a reservation when using
     FairScheduler OPEN: Ditto
   - YARN-4827 - Document configuration of ReservationSystem for
     FairScheduler OPEN: Ditto

Features merged for GA:

   - Erasure coding
  - People are looking more at the flaky tests and nice-to-haves
  - Some bugs reported and being fixed based on testing at Cloudera
  - Need to finish the 3.0 must-do's.
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
   - Sean has posted a new rev of the rolling upgrade script
  - Some YARN PB backward compat issues that we decided weren't
  blockers and are scheduled for GA
   - Classpath isolation (HADOOP-11656)
      - HADOOP-13917 (Ensure nightly builds run the integration tests for
        the shaded client): Resolved, Sean retriggered and determined that
        this works.
      - HADOOP-14771 is still floating, along with adding documentation.
   - Compat guide (HADOOP-13714)
      - A few subtasks are targeted at GA
   - TSv2 alpha 2
      - This was merged, no problems thus far :)

Unmerged features:

   - Resource profiles (YARN-3926 and YARN-7069) (Wangda Tan)
      - This has been merged for 3.1.0, YARN-7069 tracks follow-on work
      - ~7 patch-available subtasks, I asked Wangda to set up a JIRA query
        for tracking this
   - HDFS router-based federation (HDFS-10467) (Inigo Goiri and Chris Douglas)
      - Inigo sent out the merge vote
   - API-based scheduler configuration (Jonathan Hung)
      - Jonathan sent out a discuss thread for merge, thinking is early next
        week. Larry did a security-oriented review.
   - YARN native services (YARN-5079) (Jian He)
      - Subtasks were filed to address Allen's review comments from the
        previous merge vote, only one pending
      - We need to confirm with Allen that this is ready to go, he hasn't
        been reviewing


Re: [DISCUSS] HADOOP-9122 Add power mock library for writing better unit tests

2017-09-29 Thread Andrew Wang
Making code testable is about a lot more than your mocking library. In
HDFS, the NameNode is very monolithic, so it's hard to instantiate little
pieces of functionality in isolation and mock things out. I have my doubts
that Powermock will help with this unless someone's willing to invest in
significant refactoring effort of our existing code and unit test suites.

I could see an argument for this being useful for new code being developed,
but like Chris, I'd like to see an example of where Mockito falls short,
and what additional capabilities Powermock brings to the table.
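
For concreteness, the capability usually cited is static mocking, which plain Mockito can't do; a minimal, purely illustrative sketch (not a proposed test, and it assumes the powermock-module-junit4 and powermock-api-mockito artifacts on the test classpath):

import org.apache.hadoop.security.UserGroupInformation;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

@RunWith(PowerMockRunner.class)
@PrepareForTest(UserGroupInformation.class)
public class TestStaticMocking {
  @Test
  public void testMockedCurrentUser() throws Exception {
    // Intercept a static factory method, which Mockito alone cannot do
    UserGroupInformation fakeUser =
        PowerMockito.mock(UserGroupInformation.class);
    PowerMockito.mockStatic(UserGroupInformation.class);
    PowerMockito.when(UserGroupInformation.getCurrentUser())
        .thenReturn(fakeUser);
    // ...exercise code under test that calls getCurrentUser()...
  }
}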

Best,
Andrew

On Fri, Sep 29, 2017 at 10:38 AM, Chris Douglas  wrote:

> Eric-
>
> Can you explain how Powermock differs from/augments Mockito, why we
> should adopt it, and maybe an example of an existing test that could
> be improved using this library? -C
>
> On Fri, Sep 29, 2017 at 10:12 AM, Eric Yang  wrote:
> > Hi Hadoop-dev,
> >
> > Long time ago, Hadoop community decided to put Powermock on hold for
> unit tests.  Both mockito and powermock has evolved a lot in the past 5
> years.  There are mature versions of both software, and there are
> compatibility charts to indicate which versions can work together.  Hadoop
> has grown a lot in the last 5 years.  It becomes apparent that without
> ability to instrument lower level classes to contain unit test scope.  Many
> tests are written to simulate integration test in order to perform unit
> tests.  The result is slow performance on unit tests, and some parts are
> not testable strictly in unit test case.  This discussion is to revisit the
> decision, and see if we would embrace Powermock and allow HADOOP-9122 to be
> implemented.  Feel free to comment on HADOOP-9122 and this thread to
> revisit this issue.
> >
> > Thank you for your time.
> >
> > Regards,
> > Eric
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


Re: [DISCUSS] Merging API-based scheduler configuration to trunk/branch-2

2017-09-29 Thread Andrew Wang
Hi Jonathan,

I'm okay with putting this into branch-3.0 for GA if it can be merged
within the next two weeks. Even though beta1 has slipped by a month, I want
to stick to the targeted GA date of Nov 1st as much as possible. Of course,
let's not sacrifice quality or stability for speed; if something's not
ready, let's defer it to 3.1.0.

Subru, have you been able to review this feature from the 2.9.0
perspective? It'd add confidence if you think it's immediately ready for
merging to branch-2 for 2.9.0.

Thanks,
Andrew

On Thu, Sep 28, 2017 at 11:32 AM, Jonathan Hung 
wrote:

> Hi everyone,
>
> Starting this thread to discuss merging API-based scheduler configuration
> to trunk/branch-2. The feature adds the framework for allowing users to
> modify scheduler configuration via REST or CLI using a configurable backend
> (leveldb/zk are currently supported), and adds capacity scheduler support
> for this. The umbrella JIRA is YARN-5734. All the required work for this
> feature is done and committed to branch YARN-5734, and a full diff has been
> generated at YARN-7241.
>
> Regarding compatibility, this feature is configurable and turned off by
> default.
>
> The feature has been tested locally on a couple RMs (since it is an RM
> only change), with queue addition/removal/updates tested on single RM
> (leveldb) and two RMs (zk). Also we verified the original configuration
> update mechanism (via refreshQueues) is unaffected when the feature is
> off/not configured.
>
> Our original plan was to merge this to trunk (which is what the YARN-7241
> diff is based on), and port to branch-2 before the 2.9 release. @Andrew,
> what are your thoughts on also merging this to branch-3.0?
>
> Thanks!
>
> Jonathan Hung
>
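
For those wanting to kick the tires on the feature itself: the rough shape (names per YARN-5734; treat the exact strings as assumptions until the docs land) is to enable a mutable store in yarn-site.xml, e.g. yarn.scheduler.configuration.store.class=leveldb, then drive queue updates through the CLI:

$ yarn schedconf -update "root.default:maximum-capacity=50"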


[VOTE] Release Apache Hadoop 3.0.0-beta1 RC0

2017-09-28 Thread Andrew Wang
Hi all,

Let me start, as always, by thanking the many, many contributors who helped
with this release! I've prepared an RC0 for 3.0.0-beta1:

http://home.apache.org/~wang/3.0.0-beta1-RC0/

This vote will run five days, ending on Oct 3rd at 5PM Pacific.

beta1 contains 576 fixed JIRA issues comprising a number of bug fixes,
improvements, and feature enhancements. Notable additions include the
addition of YARN Timeline Service v2 alpha2, S3Guard, completion of the
shaded client, and HDFS erasure coding pluggable policy support.

I've done the traditional testing of running a Pi job on a pseudo cluster.
My +1 to start.
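
For anyone repeating the smoke test, it's just the bundled example job; something like the following (the jar path is an assumption, adjust for your layout):

$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar pi 10 100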

We're working internally on getting this run through our integration test
rig. I'm hoping Vijay or Ray can ring in with a +1 once that's complete.

Best,
Andrew


Re: Heads up: branching branch-3.0.0-beta1 off of branch-3.0

2017-09-28 Thread Andrew Wang
Branch has been cut, branch-3.0 is now open for commits for 3.0.0 GA.

HEAD of branch-3.0.0-beta1 is 2223393ad1d5ffdd62da79e1546de79c6259dc12.

On Thu, Sep 28, 2017 at 10:52 AM, Andrew Wang <andrew.w...@cloudera.com>
wrote:

> Hi folks,
>
> We've driven the blocker count down to 0, and I went through and made sure
> the fix versions and release notes and so on are all lined up.
>
> I'm going to cut branch-3.0.0-beta1 off branch-3.0 and try and get RC0 out
> today.
>
> Cheers,
> Andrew
>


Heads up: branching branch-3.0.0-beta1 off of branch-3.0

2017-09-28 Thread Andrew Wang
Hi folks,

We've driven the blocker count down to 0, and I went through and made sure
the fix versions and release notes and so on are all lined up.

I'm going to cut branch-3.0.0-beta1 off branch-3.0 and try and get RC0 out
today.

Cheers,
Andrew


[jira] [Reopened] (HADOOP-13917) Ensure yetus personality runs the integration tests for the shaded client

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-13917:
--

> Ensure yetus personality runs the integration tests for the shaded client
> -
>
> Key: HADOOP-13917
> URL: https://issues.apache.org/jira/browse/HADOOP-13917
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, test
>Affects Versions: 3.0.0-alpha2
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-13917.WIP.0.patch, HADOOP-14771.02.patch
>
>
> Either QBT or a different jenkins job should run our integration tests, 
> specifically the ones added for the shaded client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-13917) Ensure yetus personality runs the integration tests for the shaded client

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-13917.
--
Resolution: Delivered

> Ensure yetus personality runs the integration tests for the shaded client
> -
>
> Key: HADOOP-13917
> URL: https://issues.apache.org/jira/browse/HADOOP-13917
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, test
>Affects Versions: 3.0.0-alpha2
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-13917.WIP.0.patch, HADOOP-14771.02.patch
>
>
> Either QBT or a different jenkins job should run our integration tests, 
> specifically the ones added for the shaded client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14545) Uninitialized S3A instance NPEs on toString()

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-14545.
--
Resolution: Duplicate

> Uninitialized S3A instance NPEs on toString()
> -
>
> Key: HADOOP-14545
> URL: https://issues.apache.org/jira/browse/HADOOP-14545
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.0.0-beta1
>
>
> You can't log an uninited S3AFileSystem instance without getting a stack trace
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getDefaultBlockSize(S3AFileSystem.java:2131)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.toString(S3AFileSystem.java:2148)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14545) Uninitialized S3A instance NPEs on toString()

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-14545:
--

> Uninitialized S3A instance NPEs on toString()
> -
>
> Key: HADOOP-14545
> URL: https://issues.apache.org/jira/browse/HADOOP-14545
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.0.0-beta1
>
>
> You can't log an uninited S3AFileSystem instance without getting a stack trace
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getDefaultBlockSize(S3AFileSystem.java:2131)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.toString(S3AFileSystem.java:2148)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14834) Remove original S3A output stream

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-14834.
--
Resolution: Duplicate

> Remove original S3A output stream
> -
>
> Key: HADOOP-14834
> URL: https://issues.apache.org/jira/browse/HADOOP-14834
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.0.0-beta1
>
>
> The S3A Block output stream is working well and much better than the original 
> stream in terms of: scale, performance, instrumentation, robustness
> Proposed: switch this to be the default, as a precursor to removing it later 
> HADOOP-14746



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14834) Remove original S3A output stream

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-14834:
--

> Remove original S3A output stream
> -
>
> Key: HADOOP-14834
> URL: https://issues.apache.org/jira/browse/HADOOP-14834
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.0.0-beta1
>
>
> The S3A Block output stream is working well and much better than the original 
> stream in terms of: scale, performance, instrumentation, robustness
> Proposed: switch this to be the default, as a precursor to removing it later 
> HADOOP-14746



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14879) Build failure due to failing hadoop-client-check-invariants for hadoop-client-runtime.jar

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-14879.
--
Resolution: Done

> Build failure due to failing hadoop-client-check-invariants for 
> hadoop-client-runtime.jar
> -
>
> Key: HADOOP-14879
> URL: https://issues.apache.org/jira/browse/HADOOP-14879
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.1.0
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Blocker
> Fix For: 3.0.0-beta1
>
>
> {noformat}
> [ERROR] Found artifact with unexpected contents: 
> '/.../hadoop-client-modules/hadoop-client-runtime/target/hadoop-client-runtime-3.1.0-SNAPSHOT.jar'
> Please check the following and either correct the build or update
> the allowed list with reasoning.
> javax/
> javax/inject/
> javax/inject/Inject.class
> javax/inject/Named.class
> javax/inject/Provider.class
> javax/inject/Qualifier.class
> javax/inject/Scope.class
> javax/inject/Singleton.class
> jersey/
> jersey/repackaged/
> jersey/repackaged/org/
> jersey/repackaged/org/objectweb/
> jersey/repackaged/org/objectweb/asm/
> jersey/repackaged/org/objectweb/asm/AnnotationVisitor.class
> jersey/repackaged/org/objectweb/asm/AnnotationWriter.class
> jersey/repackaged/org/objectweb/asm/Attribute.class
> jersey/repackaged/org/objectweb/asm/ByteVector.class
> jersey/repackaged/org/objectweb/asm/ClassReader.class
> jersey/repackaged/org/objectweb/asm/ClassVisitor.class
> jersey/repackaged/org/objectweb/asm/ClassWriter.class
> jersey/repackaged/org/objectweb/asm/Context.class
> jersey/repackaged/org/objectweb/asm/Edge.class
> jersey/repackaged/org/objectweb/asm/FieldVisitor.class
> jersey/repackaged/org/objectweb/asm/FieldWriter.class
> jersey/repackaged/org/objectweb/asm/Frame.class
> jersey/repackaged/org/objectweb/asm/Handle.class
> jersey/repackaged/org/objectweb/asm/Handler.class
> jersey/repackaged/org/objectweb/asm/Item.class
> jersey/repackaged/org/objectweb/asm/Label.class
> jersey/repackaged/org/objectweb/asm/MethodVisitor.class
> jersey/repackaged/org/objectweb/asm/MethodWriter.class
> jersey/repackaged/org/objectweb/asm/Opcodes.class
> jersey/repackaged/org/objectweb/asm/Type.class
> jersey/repackaged/org/objectweb/asm/TypePath.class
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14879) Build failure due to failing hadoop-client-check-invariants for hadoop-client-runtime.jar

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-14879:
--

> Build failure due to failing hadoop-client-check-invariants for 
> hadoop-client-runtime.jar
> -
>
> Key: HADOOP-14879
> URL: https://issues.apache.org/jira/browse/HADOOP-14879
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.1.0
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Blocker
> Fix For: 3.0.0-beta1
>
>
> {noformat}
> [ERROR] Found artifact with unexpected contents: 
> '/.../hadoop-client-modules/hadoop-client-runtime/target/hadoop-client-runtime-3.1.0-SNAPSHOT.jar'
> Please check the following and either correct the build or update
> the allowed list with reasoning.
> javax/
> javax/inject/
> javax/inject/Inject.class
> javax/inject/Named.class
> javax/inject/Provider.class
> javax/inject/Qualifier.class
> javax/inject/Scope.class
> javax/inject/Singleton.class
> jersey/
> jersey/repackaged/
> jersey/repackaged/org/
> jersey/repackaged/org/objectweb/
> jersey/repackaged/org/objectweb/asm/
> jersey/repackaged/org/objectweb/asm/AnnotationVisitor.class
> jersey/repackaged/org/objectweb/asm/AnnotationWriter.class
> jersey/repackaged/org/objectweb/asm/Attribute.class
> jersey/repackaged/org/objectweb/asm/ByteVector.class
> jersey/repackaged/org/objectweb/asm/ClassReader.class
> jersey/repackaged/org/objectweb/asm/ClassVisitor.class
> jersey/repackaged/org/objectweb/asm/ClassWriter.class
> jersey/repackaged/org/objectweb/asm/Context.class
> jersey/repackaged/org/objectweb/asm/Edge.class
> jersey/repackaged/org/objectweb/asm/FieldVisitor.class
> jersey/repackaged/org/objectweb/asm/FieldWriter.class
> jersey/repackaged/org/objectweb/asm/Frame.class
> jersey/repackaged/org/objectweb/asm/Handle.class
> jersey/repackaged/org/objectweb/asm/Handler.class
> jersey/repackaged/org/objectweb/asm/Item.class
> jersey/repackaged/org/objectweb/asm/Label.class
> jersey/repackaged/org/objectweb/asm/MethodVisitor.class
> jersey/repackaged/org/objectweb/asm/MethodWriter.class
> jersey/repackaged/org/objectweb/asm/Opcodes.class
> jersey/repackaged/org/objectweb/asm/Type.class
> jersey/repackaged/org/objectweb/asm/TypePath.class
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-11656) Classpath isolation for downstream clients

2017-09-27 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-11656.
--
   Resolution: Done
 Hadoop Flags:   (was: Incompatible change)
Fix Version/s: 3.0.0-beta1

I'm resolving this as Done since all subtasks have been completed. There are 
still a few follow-ons being tracked for 3.0.0 GA.

Many thanks to Sean for driving this forward! Great work!
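
For downstream projects, consuming the shaded artifacts looks roughly like this (a sketch; the version string is an assumption):

{code}
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client-api</artifactId>
  <version>3.0.0-beta1</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client-runtime</artifactId>
  <version>3.0.0-beta1</version>
  <scope>runtime</scope>
</dependency>
{code}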

> Classpath isolation for downstream clients
> --
>
> Key: HADOOP-11656
> URL: https://issues.apache.org/jira/browse/HADOOP-11656
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Blocker
>  Labels: classloading, classpath, dependencies, scripts, shell
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-11656_proposal.md
>
>
> Currently, Hadoop exposes downstream clients to a variety of third party 
> libraries. As our code base grows and matures we increase the set of 
> libraries we rely on. At the same time, as our user base grows we increase 
> the likelihood that some downstream project will run into a conflict while 
> attempting to use a different version of some library we depend on. This has 
> already happened with i.e. Guava several times for HBase, Accumulo, and Spark 
> (and I'm sure others).
> While YARN-286 and MAPREDUCE-1700 provided an initial effort, they default to 
> off and they don't do anything to help dependency conflicts on the driver 
> side or for folks talking to HDFS directly. This should serve as an umbrella 
> for changes needed to do things thoroughly on the next major version.
> We should ensure that downstream clients
> 1) can depend on a client artifact for each of HDFS, YARN, and MapReduce that 
> doesn't pull in any third party dependencies
> 2) only see our public API classes (or as close to this as feasible) when 
> executing user provided code, whether client side in a launcher/driver or on 
> the cluster in a container or within MR.
> This provides us with a double benefit: users get less grief when they want 
> to run substantially ahead or behind the versions we need and the project is 
> freer to change our own dependency versions because they'll no longer be in 
> our compatibility promises.
> Project specific task jiras to follow after I get some justifying use cases 
> written in the comments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14655) Update httpcore version to 4.4.6

2017-09-22 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-14655:
--

Reverted this JIRA from trunk and branch-3.0 per Marton's instructions.

> Update httpcore version to 4.4.6
> 
>
> Key: HADOOP-14655
> URL: https://issues.apache.org/jira/browse/HADOOP-14655
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ray Chiang
>Assignee: Ray Chiang
> Attachments: HADOOP-14655.001.patch
>
>
> Update the dependency
> org.apache.httpcomponents:httpcore:4.4.4
> to the latest (4.4.6).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



2017-09-22 Hadoop 3 release status update

2017-09-22 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-22

We've had some late breaking blockers related to Docker support that are
delaying the release. We're on a day-by-day slip at this point.



Highlights:

   - I did a successful test create-release earlier this week.

Red flags:

   - Docker work resulted in some last minute blockers

Previously tracked beta1 blockers that have been resolved or dropped:

   - HADOOP-14771 (hadoop-client does not include hadoop-yarn-client):
     Dropped this from the blocker list as it's mainly for documentation
     purposes
   - HDFS-12247 (Rename AddECPolicyResponse to
     AddErasureCodingPolicyResponse) was committed.

beta1 blockers:

   - YARN-6623 (Add support to turn off launching privileged containers in
     the container-executor): This is a newly escalated blocker related to
     the Docker work in YARN. Patch is up but we're still waiting on a
     commit.
   - HADOOP-14897 (Loosen compatibility guidelines for native dependencies):
     Raised by Chris Douglas, Daniel will post a patch soon.

beta1 features:

   - Erasure coding
  - Resolved last must-do for beta1!
  - People are looking more at the flaky tests and nice-to-haves
  - Eddy continues to make improvements to block reconstruction
  codepaths
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
   - Ray has gone through almost all the YARN protos and thinks we're okay
  to move forwards.
  - I think we'll move forward without this committed, given that Sean
  has run it successfully.
   - Classpath isolation (HADOOP-11656)
      - HADOOP-13917 (Ensure nightly builds run the integration tests for
        the shaded client): Sean wants to get this in before beta1 if
        there's time, it's already catching issues. Relies on YETUS-543
        which I reviewed, waiting on Allen.
      - HADOOP-14771 might be squeezed in if there's time.
   - Compat guide (HADOOP-13714)
      - HADOOP-14897: Above-mentioned blocker filed by Chris Douglas.
   - TSv2 alpha 2
      - This was merged, no problems thus far :)

GA features:

   - Resource profiles (Wangda Tan)
      - Merge vote was sent out. Since branch-3.0 has been cut, this can be
      merged to trunk (3.1.0) and then backported once we've completed
      testing.
   - HDFS router-based federation (Chris Douglas)
      - This is like YARN federation, very separate and doesn't add new
      APIs, run in production at MSFT.
      - If it passes Cloudera internal integration testing, I'm fine
      putting this in for GA.
   - API-based scheduler configuration (Jonathan Hung)
      - Jonathan mentioned that his main goal is to get this in for 2.9.0,
      which seems likely to go out after 3.0.0 GA since there hasn't been
      any serious release planning yet. Jonathan said that delaying this
      until 3.1.0 is fine.
   - YARN native services
      - Still not 100% clear when this will land.


Re: [DISCUSS] moving to Apache Yetus Audience Annotations

2017-09-22 Thread Andrew Wang
Yea, unfortunately I'd say backburner it. This would have been perfect
during alpha.

On Fri, Sep 22, 2017 at 11:14 AM, Sean Busbey <bus...@cloudera.com> wrote:

> I'd refer to it as an incompatible change; we expressly label the
> annotations as IA.Public.
>
> If you think it's too late to get in for 3.0, I can make a jira and put it
> on the back burner for when trunk goes to 4.0?
>
> On Fri, Sep 22, 2017 at 12:49 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Is this itself an incompatible change? I imagine the bytecode will be
>> different.
>>
>> I think we're too late to do this for beta1 given that I want to cut an
>> RC0 today.
>>
>> On Fri, Sep 22, 2017 at 7:03 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>>> When Apache Yetus formed, it started with several key pieces of Hadoop
>>> that
>>> looked reusable. In addition to our contribution testing infra, the
>>> project
>>> also stood up a version of our audience annotations for delineating the
>>> public facing API[1].
>>>
>>> I recently got the Apache HBase community onto the Yetus version of those
>>> annotations rather than their internal fork of the Hadoop ones[2]. It
>>> wasn't pretty, mostly a lot of blind sed followed by spot checking and
>>> reliance on automated tests.
>>>
>>> What do folks think about making the jump ourselves? I'd be happy to work
>>> through things, either as one unreviewable monster or per-module
>>> transitions (though a piece-meal approach might complicate our javadoc
>>> situation).
>>>
>>>
>>> [1]: http://yetus.apache.org/documentation/0.5.0/interface-classification/
>>> [2]: https://issues.apache.org/jira/browse/HBASE-17823
>>>
>>> --
>>> busbey
>>>
>>
>>
>
>
> --
> busbey
>


Re: [DISCUSS] moving to Apache Yetus Audience Annotations

2017-09-22 Thread Andrew Wang
Is this itself an incompatible change? I imagine the bytecode will be
different.

I think we're too late to do this for beta1 given that I want to cut an RC0
today.

On Fri, Sep 22, 2017 at 7:03 AM, Sean Busbey  wrote:

> When Apache Yetus formed, it started with several key pieces of Hadoop that
> looked reusable. In addition to our contribution testing infra, the project
> also stood up a version of our audience annotations for delineating the
> public facing API[1].
>
> I recently got the Apache HBase community onto the Yetus version of those
> annotations rather than their internal fork of the Hadoop ones[2]. It
> wasn't pretty, mostly a lot of blind sed followed by spot checking and
> reliance on automated tests.
>
> What do folks think about making the jump ourselves? I'd be happy to work
> through things, either as one unreviewable monster or per-module
> transitions (though a piece-meal approach might complicate our javadoc
> situation).
>
>
> [1]: http://yetus.apache.org/documentation/0.5.0/interface-classification/
> [2]: https://issues.apache.org/jira/browse/HBASE-17823
>
> --
> busbey
>
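
For reference, the change is mostly a package rename: Hadoop's annotations
live in org.apache.hadoop.classification, while the Yetus versions live in
org.apache.yetus.audience (artifact org.apache.yetus:audience-annotations).
A minimal sketch of a migrated class, with ExampleClient as a placeholder
name:

{code}
// Before: Hadoop's own annotations.
// import org.apache.hadoop.classification.InterfaceAudience;
// import org.apache.hadoop.classification.InterfaceStability;

// After: the Apache Yetus annotations.
import org.apache.yetus.audience.InterfaceAudience;
import org.apache.yetus.audience.InterfaceStability;

@InterfaceAudience.Public
@InterfaceStability.Evolving
public class ExampleClient {
  // The class body is untouched; only the imports move, which is why a
  // blind sed pass gets most of the way there.
}
{code}

Note that the annotation type names are recorded in compiled .class files,
so anything reading the old names reflectively sees different bytecode
after the swap; that is the incompatibility concern raised above.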


[jira] [Resolved] (HADOOP-14655) Update httpcore version to 4.4.6

2017-09-20 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-14655.
--
Resolution: Fixed

While it's in the branch, let's leave the JIRA resolved for release notes 
purposes.

> Update httpcore version to 4.4.6
> 
>
> Key: HADOOP-14655
> URL: https://issues.apache.org/jira/browse/HADOOP-14655
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ray Chiang
>Assignee: Ray Chiang
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14655.001.patch
>
>
> Update the dependency
> org.apache.httpcomponents:httpcore:4.4.4
> to the latest (4.4.6).






2017-09-19 Hadoop 3 release status update

2017-09-19 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-19

Sorry for the late update. We're down to one blocker and one EC must do!
Made great progress over the last week and a bit.

We will likely cut RC0 this week.

Highlights:

   - Down to just two blocker issues!

Red flags:

   - HDFS unit tests are quite flaky. Some blockers were filed and then
   resolved or downgraded. More work to do here.

Previously tracked beta1 blockers that have been resolved or dropped:

   - HADOOP-14738 (Remove S3N and obsolete bits of S3A; rework docs):
   Committed!
   - HADOOP-14284 (Shade Guava everywhere): We resolved this since we
   decided it was unnecessary for beta1.
   - YARN-7162 (Remove XML excludes file format): Robert committed after
   review from Junping.
   - HADOOP-14847 (Remove Guava Supplier and change to java Supplier in
   AMRMClient and AMRMClientAsync): Committed! (A before/after sketch of
   the API change follows this list.)
   - HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing
   API): We dropped this off the blocker list in the absence of other known
   issues.
   - HADOOP-14835 (mvn site build throws SAX errors): I committed after
   further discussion and review with Sean Mackrory and Allen. Planning to
   switch to japicmp for later releases.
   - HDFS-12218 (Rename split EC / replicated block metrics in
   BlockManager): Committed.
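
As a sketch of the HADOOP-14847 change mentioned above: the Guava Supplier
type leaked into a public signature, and the fix swaps it for the JDK one.
The exact waitFor overload shown here is an assumption based on the JIRA
summary, not the committed patch:

{code}
import java.util.function.Supplier;
import org.apache.hadoop.yarn.client.api.AMRMClient;

public class WaitExample {
  // Hadoop 2.x (assumed shape of the old API):
  //   void waitFor(com.google.common.base.Supplier<Boolean> check)
  // After HADOOP-14847 the parameter is java.util.function.Supplier, so
  // callers can pass a plain lambda and no longer need Guava to compile:
  static void waitUntilDone(AMRMClient<?> client, Supplier<Boolean> done)
      throws InterruptedException {
    client.waitFor(done::get); // method ref adapts to either Supplier type
  }
}
{code}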


beta1 blockers:

   - HADOOP-14771 (hadoop-client does not include hadoop-yarn-client): This
   was committed but then reverted since it broke the build. Haibo and Sean
   are actively pressing towards a correct fix.


beta1 features:

   - Erasure coding
      - Resolved a number of must-dos
         - HDFS-7859 (fsimage changes) was committed!
         - HDFS-12395 (edit log changes) was also committed!
         - HDFS-12218 is discussed above.
      - Remaining blockers:
         - HDFS-12447 is to refactor some of the fsimage code, Andrew
         needs to review
      - There's also been progress cleaning up the flaky unit tests, still
      more to do
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
      - Ray has gone through almost all the YARN protos and thinks we're
      okay to move forwards.
      - I think we'll move forward without this committed, given that Sean
      has run it successfully.
   - Classpath isolation (HADOOP-11656)
      - We have just HADOOP-14771 left.
   - Compat guide (HADOOP-13714)
      - This was committed! Some follow-on work filed for GA.
   - TSv2 alpha 2
      - This was merged, no problems thus far (smile)

GA features:

   - Resource profiles (Wangda Tan)
      - Merge vote was sent out. Since branch-3.0 has been cut, this can be
      merged to trunk (3.1.0) and then backported once we've completed
      testing.
   - HDFS router-based federation (Chris Douglas)
      - This is like YARN federation, very separate and doesn't add new
      APIs, run in production at MSFT.
      - If it passes Cloudera internal integration testing, I'm fine
      putting this in for GA.
   - API-based scheduler configuration (Jonathan Hung)
      - Jonathan mentioned that his main goal is to get this in for 2.9.0,
      which seems likely to go out after 3.0.0 GA since there hasn't been
      any serious release planning yet. Jonathan said that delaying this
      until 3.1.0 is fine.
   - YARN native services
      - Still not 100% clear when this will land.


Re: [DISCUSS] Can we make our precommit test robust to dependency changes while staying usable?

2017-09-14 Thread Andrew Wang
On Thu, Sep 14, 2017 at 1:59 PM, Sean Busbey  wrote:

>
>
> On 2017-09-14 15:36, Chris Douglas  wrote:
> > This has gotten bad enough that people are dismissing legitimate test
> > failures among the noise.
> >
> > On Thu, Sep 14, 2017 at 1:20 PM, Allen Wittenauer
> >  wrote:
> > > Someone should probably invest some time into integrating the
> HBase flaky test code a) into Yetus and then b) into Hadoop.
> >
> > What does the HBase flaky test code do? Another extension to
> > test-patch could run all new/modified tests multiple times, and report
> > to JIRA if any run fails.
> >
>
> The current HBase stuff segregates untrusted tests by looking through
> nightly test runs to find things that fail intermittently. We then don't
> include those tests in either nightly or precommit tests. We have a
> different job that just runs the untrusted tests and if they start passing
> removes them from the list.
>
> There's also a project getting used by SOLR called "BeastIT" that goes
> through running parallel copies of a given test a large number of times to
> reveal flaky tests.
>
> Getting either/both of those into Yetus and used here would be a huge
> improvement.
>
I discussed this on yetus-dev a while back and Allen thought it'd be
non-trivial:

https://lists.apache.org/thread.html/552ad614d1b3d5226a656b60c0108457bcaa1219fb9ad985f8750ba1@%3Cdev.yetus.apache.org%3E

I unfortunately don't have the test-patch.sh expertise to dig into this.
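
For the record, the core of such a "beast" run is small. Here's a minimal
sketch in plain JUnit 4 terms, serial rather than parallel for simplicity
(TestFoo and the run count are placeholders, and this is only the idea,
not the SOLR tooling itself):

{code}
import org.junit.runner.JUnitCore;
import org.junit.runner.Result;

// Rerun a single test class N times to estimate how flaky it is.
public class Beast {
  public static void main(String[] args) throws Exception {
    Class<?> testClass = Class.forName(args[0]); // e.g. "TestFoo"
    int runs = Integer.parseInt(args[1]);        // e.g. 100
    int failed = 0;
    for (int i = 0; i < runs; i++) {
      Result result = JUnitCore.runClasses(testClass);
      if (!result.wasSuccessful()) {
        failed++;
      }
    }
    System.out.printf("%s: %d/%d runs failed%n", args[0], failed, runs);
  }
}
{code}

A test that fails even once in a loop like this would land on the
"untrusted" list in the HBase scheme until it passes consistently again.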

-
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-11 Thread Andrew Wang
Thanks for your consideration Jian, let's track this for GA then.

Best,
Andrew

On Fri, Sep 8, 2017 at 3:02 PM, Jian He <j...@hortonworks.com> wrote:

> Hi Andrew,
>
> At this point, there are no more release blockers including documentations
> from our side - all work done.
> But I agree it is too close to the release, after talking with other team
> members, we are fine to drop  this from beta,
>
> And we want to target this for GA.
> I’m withdrawing this vote and will start afresh vote later for GA.
> Thanks all who voted this effort !
>
> Thanks,
> Jian
>
>
> > On Sep 7, 2017, at 3:59 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> >
> > Hi folks,
> >
> > This vote closes today. I see a -1 from Allen on inclusion in beta1. I
> > see there's active fixing going on, but given that we're one week out
> > from RC0, I think we should drop this from beta1.
> >
> > Allen, Jian, others, is this reasonable? What release should we retarget
> > this for? I don't have a sense for how much work there is left to do, but
> > as a reminder, we're planning GA for Nov 1st, and 3.1.0 for January.
> >
> > Best,
> > Andrew
> >
> > On Wed, Sep 6, 2017 at 10:19 AM, Jian He <j...@hortonworks.com> wrote:
> >
> >>>  Please correct me if I’m wrong, but the current summary of the
> >> branch, post these changes, looks like:
> >> Sorry for confusion, I was actively writing the formal documentation for
> >> how to use/how it works etc. and will post soon in a few hours.
> >>
> >>
> >>> On Sep 6, 2017, at 10:15 AM, Allen Wittenauer <
> a...@effectivemachines.com>
> >> wrote:
> >>>
> >>>
> >>>> On Sep 5, 2017, at 6:23 PM, Jian He <j...@hortonworks.com> wrote:
> >>>>
> >>>>>If it doesn’t have all the bells and whistles, then it shouldn’t
> >> be on port 53 by default.
> >>>> Sure, I’ll change the default port to not use 53 and document it.
> >>>>>*how* is it getting launched on a privileged port? It sounds like
> >> the expectation is to run “command” as root.   *ALL* of the previous
> >> daemons in Hadoop that needed a privileged port used jsvc.  Why isn’t
> this
> >> one? These questions matter from a security standpoint.
> >>>> Yes, it is running as “root” to be able to use the privileged port.
> The
> >> DNS server is not yet integrated with the hadoop script.
> >>>>
> >>>>> Check the output.  It’s pretty obviously borked:
> >>>> Thanks for pointing out. Missed this when rebasing onto trunk.
> >>>
> >>>
> >>>  Please correct me if I’m wrong, but the current summary of the
> >> branch, post these changes, looks like:
> >>>
> >>>  * A bunch of mostly new Java code that may or may not have
> >> javadocs (post-revert YARN-6877, still working out HADOOP-14835)
> >>>  * ~1/3 of the docs are roadmap/TBD
> >>>  * ~1/3 of the docs are for an optional DNS daemon that has
> >> no end user hook to start it
> >>>  * ~1/3 of the docs are for a REST API that comes from some
> >> undefined daemon (apiserver?)
> >>>  * Two new, but undocumented, subcommands to yarn
> >>>  * There are no docs for admins or users on how to actually
> >> start or use this completely new/separate/optional feature
> >>>
> >>>  How are outside people (e.g., non-branch committers) supposed to
> >> test this new feature under these conditions?
> >>>
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> >>
> >>
>
>


2017-09-07 Hadoop 3 release status update

2017-09-07 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-07

Slightly early update since I'll be out tomorrow. We're one week out, and
focus is on blocker burndown.

Highlights:

   - 3.1.0 release planning is underway, led by Wangda. Target release date
   is in January.

Red flags:

   - YARN native services merge vote got a -1 for beta1, I recommended we
   drop it from beta1 and retarget for a later release.
   - 11 blockers on the dashboard, one more than last week (sad)

Previously tracked beta1 blockers that have been resolved or dropped:

   - HADOOP-14826 was duped to HADOOP-14738.
   - YARN-5536 (Multiple format support (JSON, etc.) for exclude node file
   in NM graceful decommission with timeout): Downgraded in priority in
   favor of YARN-7162 which Robert has posted a patch for.
   - MAPREDUCE-6941 (The default setting doesn't work for MapReduce job): I
   resolved this and Junping confirmed this is fine.


beta1 blockers:

   - HADOOP-14738 (Remove S3N and obsolete bits of S3A; rework docs): Steve
   has been actively revving this with our new committer Aaron Fabbri ready
   to review. The scope has expanded from HADOOP-14826, so it's not just a
   doc update.
   - HADOOP-14284 (Shade Guava everywhere): No change since last week. This
   is an umbrella JIRA.
   - HADOOP-14771 (hadoop-client does not include hadoop-yarn-client):
   Patch up, needs review, still waiting on Busbey. Bharat gave it a
   review.
   - YARN-7162 (Remove XML excludes file format): Robert has posted a patch
   and is waiting for a review.
   - HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing
   API): Bharat took this up and turned it into an umbrella.
      - HADOOP-14847 (Remove Guava Supplier and change to java Supplier in
      AMRMClient and AMRMClientAsync): Bharat posted a patch on a subtask
      to fix the known Guava Supplier issue in AMRMClient. Needs a review.
   - HADOOP-14835 (mvn site build throws SAX errors): I'm working on this.
   Debugged it and have a proposed patch up, discussing with Allen.
   - HDFS-12218 (Rename split EC / replicated block metrics in
   BlockManager): I'm working on this, just need to commit it, already have
   a +1 from Eddy.


beta1 features:

   - Erasure coding
      - There are three must-dos, all being actively worked on.
      - HDFS-7859 is being actively reviewed and revved by Sammi and Kai
      and Eddy.
      - HDFS-12395 was split out of HDFS-7859 to do the edit log changes.
      - HDFS-12218 is discussed above.
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
      - Ray and Allen reviewed Sean's HDFS rolling upgrade scripts.
      - Sean did a run through of the HDFS JACC report and it looked fine.
   - Classpath isolation (HADOOP-11656)
      - Sean has retriaged the subtasks and has been posting patches.
   - Compat guide (HADOOP-13714)
      - Daniel has been collecting feedback on dev lists, but still needs a
      detailed review of the patch.
   - YARN native services
      - Jian sent out the merge vote, but it's been -1'd for beta1 by
      Allen. I propose we drop this from beta1 scope and retarget.
   - TSv2 alpha 2
      - This was merged, no problems thus far (smile)

GA features:

   - Resource profiles (Wangda Tan)
      - Merge vote was sent out. Since branch-3.0 has been cut, this can be
      merged to trunk (3.1.0) and then backported once we've completed
      testing.
   - HDFS router-based federation (Chris Douglas)
      - This is like YARN federation, very separate and doesn't add new
      APIs, run in production at MSFT.
      - If it passes Cloudera internal integration testing, I'm fine
      putting this in for GA.
   - API-based scheduler configuration (Jonathan Hung)
      - Jonathan mentioned that his main goal is to get this in for 2.9.0,
      which seems likely to go out after 3.0.0 GA since there hasn't been
      any serious release planning yet. Jonathan said that delaying this
      until 3.1.0 is fine.


[jira] [Created] (HADOOP-14848) Switch from JDiff to japicmp

2017-09-07 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-14848:


 Summary: Switch from JDiff to japicmp
 Key: HADOOP-14848
 URL: https://issues.apache.org/jira/browse/HADOOP-14848
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha4
Reporter: Andrew Wang


JDiff is old and not maintained. It complicates our build by requiring xerces, 
and also a lot of Maven logic to custom patch it and stitch it up.

japicmp was proposed as a more up-to-date tool that also has a Maven plugin. 
It's also ALv2 which is a nice bonus.






Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-07 Thread Andrew Wang
Hi folks,

This vote closes today. I see a -1 from Allen on inclusion in beta1. I see
there's active fixing going on, but given that we're one week out from RC0,
I think we should drop this from beta1.

Allen, Jian, others, is this reasonable? What release should we retarget
this for? I don't have a sense for how much work there is left to do, but
as a reminder, we're planning GA for Nov 1st, and 3.1.0 for January.

Best,
Andrew

On Wed, Sep 6, 2017 at 10:19 AM, Jian He  wrote:

> >   Please correct me if I’m wrong, but the current summary of the
> branch, post these changes, looks like:
> Sorry for confusion, I was actively writing the formal documentation for
> how to use/how it works etc. and will post soon in a few hours.
>
>
> > On Sep 6, 2017, at 10:15 AM, Allen Wittenauer 
> wrote:
> >
> >
> >> On Sep 5, 2017, at 6:23 PM, Jian He  wrote:
> >>
> >>> If it doesn’t have all the bells and whistles, then it shouldn’t
> be on port 53 by default.
> >> Sure, I’ll change the default port to not use 53 and document it.
> >>> *how* is it getting launched on a privileged port? It sounds like
> the expectation is to run “command” as root.   *ALL* of the previous
> daemons in Hadoop that needed a privileged port used jsvc.  Why isn’t this
> one? These questions matter from a security standpoint.
> >> Yes, it is running as “root” to be able to use the privileged port. The
> DNS server is not yet integrated with the hadoop script.
> >>
> >>> Check the output.  It’s pretty obviously borked:
> >> Thanks for pointing out. Missed this when rebasing onto trunk.
> >
> >
> >   Please correct me if I’m wrong, but the current summary of the
> branch, post these changes, looks like:
> >
> >   * A bunch of mostly new Java code that may or may not have
> javadocs (post-revert YARN-6877, still working out HADOOP-14835)
> >   * ~1/3 of the docs are roadmap/TBD
> >   * ~1/3 of the docs are for an optional DNS daemon that has
> no end user hook to start it
> >   * ~1/3 of the docs are for a REST API that comes from some
> undefined daemon (apiserver?)
> >   * Two new, but undocumented, subcommands to yarn
> >   * There are no docs for admins or users on how to actually
> start or use this completely new/separate/optional feature
> >
> >   How are outside people (e.g., non-branch committers) supposed to
> test this new feature under these conditions?
> >
>
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>


Re: DISCUSS: Hadoop Compatability Guidelines

2017-09-07 Thread Andrew Wang
There's also the DataNode data directory layout. FS edit logs should also
be included if we're including the fsimage.

Historically we've bumped these in minor and major releases, though I'm not
sure whether precedent supports the practice. It means you can't downgrade,
and features that need metadata changes are often also destabilizing. DN
layout version upgrades are also very time intensive, since it needs to
hardlink all the blocks.

I don't think we can change this policy in the next week, but it's
something to consider post-beta1. Now that we have xattrs, there's less
need for metadata layout changes. If we revive the feature flags effort,
then there's even less need.
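
To make the downgrade constraint concrete, here's a hypothetical sketch of
the check a layout-versioned component performs at startup. The class and
constant are made up; the only real convention carried over is that HDFS
layout versions are negative and decrease as the layout gets newer:

{code}
import java.io.IOException;

// Hypothetical illustration, not actual HDFS code.
public class LayoutCheck {
  // Newest on-disk layout this software understands (made-up value).
  static final int SOFTWARE_LAYOUT_VERSION = -64;

  static void checkLayout(int onDiskVersion) throws IOException {
    if (onDiskVersion < SOFTWARE_LAYOUT_VERSION) {
      // Metadata written by newer software: downgrade is impossible,
      // which is what makes layout bumps so destabilizing.
      throw new IOException("On-disk layout " + onDiskVersion
          + " is newer than supported " + SOFTWARE_LAYOUT_VERSION);
    }
    if (onDiskVersion > SOFTWARE_LAYOUT_VERSION) {
      // Older layout: an upgrade must rewrite the metadata (and, for the
      // DataNode, hardlink every block) before serving, which is the
      // time-intensive step mentioned above.
    }
  }
}
{code}

Feature flags avoid this cliff by gating individual capabilities instead
of bumping one monolithic version number.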

Cheers,
Andrew

On Thu, Sep 7, 2017 at 11:13 AM, Daniel Templeton 
wrote:

> Good point.  I think it would be valuable to enumerate the policies around
> the versioned state stores.  We have the three you listed. We should
> probably include the HDFS fsimage in that list.  Any others?
>
> I also want to add a section that clarifies when it's OK to change the
> visibility or audience of an API.
>
> Daniel
>
>
> On 9/5/17 11:04 AM, Arun Suresh wrote:
>
>> Thanks for starting this Daniel.
>>
>> I think we should also add a section for store compatibility (all state
>> stores including RM, NM, Federation etc.). Essentially an explicit policy
>> detailing when is it ok to change the major and minor versions and how it
>> should relate to the hadoop release version.
>> Thoughts ?
>>
>> Cheers
>> -Arun
>>
>>
>> On Tue, Sep 5, 2017 at 10:38 AM, Daniel Templeton 
>> wrote:
>>
>> Good idea.  I should have thought of that. :)  Done.
>>>
>>> Daniel
>>>
>>>
>>> On 9/5/17 10:33 AM, Anu Engineer wrote:
>>>
>>> Could you please attach the PDFs to the JIRA. I think the mailer is
 stripping them off from the mail.

 Thanks
 Anu





 On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:

 Resending with a broader audience, and reattaching the PDFs.

> Daniel
>
> On 9/4/17 9:01 AM, Daniel Templeton wrote:
>
> All, in prep for Hadoop 3 beta 1 I've been working on updating the
>> compatibility guidelines on HADOOP-13714.  I think the initial doc is
>> more or less complete, so I'd like to open the discussion up to the
>> broader Hadoop community.
>>
>> In the new guidelines, I have drawn some lines in the sand regarding
>> compatibility between releases.  In some cases these lines are more
>> restrictive than the current practices.  The intent with the new
>> guidelines is not to limit progress by restricting what goes into a
>> release, but rather to drive release numbering to keep in line with
>> the reality of the code.
>>
>> Please have a read and provide feedback on the JIRA.  I'm sure there
>> are more than a couple of areas that could be improved.  If you'd
>> rather not read markdown from a diff patch, I've attached PDFs of the
>> two modified docs.
>>
>> Thanks!
>> Daniel
>>
>>
> -
>>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>>>
>>>
>>>
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>


Re: Moving Java Forward Faster

2017-09-07 Thread Andrew Wang
I read Mark Reinhold's blog [1] and overall like this change to release
cadence. It introduces 3-year LTS releases along with the 6-month feature
release cadence.

We'd probably stick with the LTS releases. The blog says they only plan to
support each LTS for 3 years though. I'd like this to instead be 6 years
(or at least more than 3), since there needs to be a period of overlapping
LTS support for migration.

Do we have any pull with the JCP?

[1] https://mreinhold.org/blog/forward-faster

On Thu, Sep 7, 2017 at 8:16 AM, Sean Busbey  wrote:

> ugh. this will be rough for cross-jdk compatibility, unless they update the
> target jre options of javac to support more than the last 2 major versions.
>
> > Question: Does GPL licensing of the JDK/JVM affect us negatively?
>
> Nope. all the openjdk bits we rely on were already going to be under the
> GPLv2 with CE, since the alternative is the Oracle Binary Code License[1],
> which is also in Cat-X[2] but for not being an Open Source license. In any
> case things built for Java are covered under the "platform" exception to
> the Cat-X designation[3], since depending Java is considered unavoidable
> for a java project.
>
>
>
> [1]: http://www.jcp.org/aboutJava/communityprocess/licenses/SE7_RIv2.doc
> [2]: http://apache.org/legal/resolved#category-x as "BCL"
> [3]: http://apache.org/legal/resolved#platform
>
> On Thu, Sep 7, 2017 at 9:29 AM, larry mccay  wrote:
>
> > Interesting.
> > Thanks for sharing this, Allen.
> >
> > Question: Does GPL licensing of the JDK/JVM affect us negatively?
> >
> >
> > On Thu, Sep 7, 2017 at 10:14 AM, Allen Wittenauer <
> > a...@effectivemachines.com>
> > wrote:
> >
> > >
> > >
> > > > Begin forwarded message:
> > > >
> > > > From: "Rory O'Donnell" 
> > > > Subject: Moving Java Forward Faster
> > > > Date: September 7, 2017 at 2:12:45 AM PDT
> > > > To: "strub...@yahoo.de >> Mark Struberg" 
> > > > Cc: rory.odonn...@oracle.com, abdul.kolarku...@oracle.com,
> > > balchandra.vai...@oracle.com, dalibor.to...@oracle.com,
> > bui...@apache.org
> > > > Reply-To: bui...@apache.org
> > > >
> > > > Hi Mark & Gavin,
> > > >
> > > > Oracle is proposing a rapid release model for Java SE going-forward.
> > > >
> > > > The high points are highlighted below, details of the changes can be
> > > found on Mark Reinhold’s blog [1] , OpenJDK discussion email list [2].
> > > >
> > > > Under the proposed release model, after JDK 9, we will adopt a
> strict,
> > > time-based model with a new major release every six months, update
> > releases
> > > every quarter, and a long-term support release every three years.
> > > >
> > > > The new JDK Project will run a bit differently than the past "JDK $N"
> > > Projects:
> > > >
> > > > - The main development line will always be open but fixes,
> > enhancements,
> > > and features will be merged only when they're nearly finished. The main
> > > line will be Feature Complete [3] at all times.
> > > >
> > > > - We'll continue to use the JEP Process [4] for new features and
> other
> > > significant changes. The bar to target a JEP to a specific release
> will,
> > > however, be higher since the work must be Feature Complete in order to
> go
> > > in. Owners of large or risky features will be strongly encouraged to
> > split
> > > such features up into smaller and safer parts, to integrate earlier in
> > the
> > > release cycle, and to publish separate lines of early-access builds
> prior
> > > to integration.
> > > >
> > > > The JDK Updates Project will run in much the same way as the past
> "JDK
> > > $N" Updates Projects, though update releases will be strictly limited
> to
> > > fixes of security issues, regressions, and bugs in newer features.
> > > >
> > > > Related to this proposal, we intend to make a few changes in what we
> > do:
> > > >
> > > > - Starting with JDK 9 we'll ship OpenJDK builds under the GPL [5], to
> > > make it easier for developers to deploy Java applications to cloud
> > > environments. We'll initially publish OpenJDK builds for Linux/x64,
> > > followed later by builds for macOS/x64 and Windows/x64.
> > > >
> > > > - We'll continue to ship proprietary "Oracle JDK" builds, which
> include
> > > "commercial features" [6] such as Java Flight Recorder and Mission
> > Control
> > > [7], under a click-through binary-code license [8]. Oracle will
> continue
> > to
> > > offer paid support for these builds.
> > > >
> > > > - After JDK 9 we'll open-source the commercial features in order to
> > make
> > > the OpenJDK builds more attractive to developers and to reduce the
> > > differences between those builds and the Oracle JDK. This will take
> some
> > > time, but the ultimate goal is to make OpenJDK and Oracle JDK builds
> > > completely interchangeable.
> > > >
> > > > - Finally, for the long term we'll work with other OpenJDK
> contributors
> > > to establish an open build-and-test infrastructure. This will 

[jira] [Resolved] (HADOOP-13998) Merge initial S3guard release into trunk

2017-09-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-13998.
--
Resolution: Done

Re-resolving per above.

> Merge initial S3guard release into trunk
> 
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-13998-001.patch, HADOOP-13998-002.patch, 
> HADOOP-13998-003.patch, HADOOP-13998-004.patch, HADOOP-13998-005.patch
>
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk






[jira] [Reopened] (HADOOP-13998) Merge initial S3guard release into trunk

2017-09-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-13998:
--

Re-opening to resolve as "Complete" or something, since this code change was 
attributed to the parent JIRA HADOOP-13345 in the commit message.

> Merge initial S3guard release into trunk
> 
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-13998-001.patch, HADOOP-13998-002.patch, 
> HADOOP-13998-003.patch, HADOOP-13998-004.patch, HADOOP-13998-005.patch
>
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk






2017-09-01 Hadoop 3 release status update

2017-09-01 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-01

We're two weeks out from beta1, focus is on blocker burndown.

Highlights:

   - S3Guard merged!
   - TSv2 alpha2 merged!
   - branch-3.0 has been cut after discussion on dev lists.

Red flags:

   - 10 blockers on the dashboard, closed and bumped some but new ones
   appeared.
   - Still need to land YARN native services and fix some S3Guard doc
   issues for beta1.
   - Rolling upgrade JIRAs for YARN and HDFS are not making any visible
   progress

Previously tracked beta1 blockers that have been resolved:

   - HADOOP-13363 (Upgrade to protobuf 3): I dropped this from beta1 since
   it's simply not going to happen in time.
   - YARN-7076: This was quickly resolved! Thanks Jian, Junping, Jason for
   the action.
   - YARN-7094 (Document that server-side graceful decom is currently not
   recommended): Patch committed!

beta1 blockers:

   - HADOOP-14826 (review S3 docs prior to 3.0.0-beta1): New blocker with
   S3Guard merged. Should just be a quick doc update.
   - HADOOP-14284 (Shade Guava everywhere): Agreement to shade yarn-client
   at HADOOP-14771. Shading hadoop-hdfs is still being discussed?
   - HADOOP-14771 (hadoop-client does not include hadoop-yarn-client):
   Patch up, needs review, waiting on Busbey.
   - YARN-5536 (Multiple format support (JSON, etc.) for exclude node file
   in NM graceful decommission with timeout): We're waiting on input from
   Junping.
   - MAPREDUCE-6941 (The default setting doesn't work for MapReduce job):
   Ray thinks this is a Won't Fix, waiting on Junping to confirm.
   - HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing
   API): This relates to HADOOP-14771, I left a JIRA comment.

beta1 features:

   - Erasure coding
      - There are three must-dos. Two have patches, one might not be a
      must-do.
      - HDFS-11882 has been revved and reviewed, seems close
      - HDFS-11467 and HDFS-7859 are related, Sammi/Eddy/Kai are
      discussing, Sammi thinks we can still make beta1.
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
      - Sean has HDFS rolling upgrade scripts up, waiting on Ray to add
      some YARN/MR coverage too.
      - Need to do a final runthrough of the JACC reports for YARN and
      HDFS.
   - Classpath isolation (HADOOP-11656)
      - Sean has retriaged the subtasks and has been posting patches.
   - Compat guide (HADOOP-13714)
      - New patch is up, but needs review. Daniel asked Chris Douglas and
      Steve Loughran.
   - YARN native services
      - Jian sent out the merge vote
   - TSv2 alpha 2
      - This was merged, no problems thus far (smile)

GA features:

   - Resource profiles (Wangda Tan)
      - Merge vote was sent out. Since branch-3.0 has been cut, this can be
      merged to trunk (3.1.0) and then backported once we've completed
      testing.
   - HDFS router-based federation (Chris Douglas)
      - This is like YARN federation, very separate and doesn't add new
      APIs, run in production at MSFT.
      - If it passes Cloudera internal integration testing, I'm fine
      putting this in for GA.
   - API-based scheduler configuration (Jonathan Hung)
      - Jonathan mentioned that his main goal is to get this in for 2.9.0,
      which seems likely to go out after 3.0.0 GA since there hasn't been
      any serious release planning yet. Jonathan said that delaying this
      until 3.1.0 is fine.


Heads up: branch-3.0 has been cut, commit here for 3.0.0-beta1

2017-09-01 Thread Andrew Wang
Hi folks,

I've proceeded with the plan from our earlier thread and cut branch-3.0.
The branches and maven versions are now set as follows:

trunk: 3.1.0-SNAPSHOT
branch-3.0: 3.0.0-beta1-SNAPSHOT

branch-2's are still the same.

This means if you want to commit something for beta1, commit it to
branch-3.0 too. Excepting features already committed for beta1 (e.g. EC,
native services, S3Guard, TSv2, YARN federation), please treat branch-3.0
the same as a maintenance release branch.

I'm planning to cut the release branch branch-3.0.0-beta1 just before RC.
If you have anything that was pushed out of 3.0.0-beta1 and is waiting for
3.0.0 GA, please hold it in trunk until after we release 3.0.0-beta1 (which
should be relatively soon).

Best,
Andrew


Re: [DISCUSS] Branches and versions for Hadoop 3

2017-09-01 Thread Andrew Wang
Hi folks,

We've landed two of our beta1 features, S3Guard and TSv2, into trunk. Jian
just sent out the vote for our remaining beta1 feature, YARN native
services, but I think it's time to branch to unblock the resource profiles
merge to 3.1.

I'll cut just branch-3.0 for now, since we don't have anything urgent that
needs to go into 3.0.0-beta1 vs. 3.0.0 GA.

Cheers,
Andrew

On Tue, Aug 29, 2017 at 11:21 PM, varunsax...@apache.org <
varun.saxena.apa...@gmail.com> wrote:

> Hi Andrew,
>
> We have completed the merge of TSv2 to trunk.
> You can now go ahead with the branching.
>
> Regards,
> Varun Saxena.
>
> On Tue, Aug 29, 2017 at 11:35 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Sure. Ping me when the TSv2 goes in, and I can take care of branching.
>>
>> We're still waiting on the native services and S3Guard merges, but I
>> don't want to hold branching to the last minute.
>>
>> On Tue, Aug 29, 2017 at 10:51 AM, Vrushali C <vrushalic2...@gmail.com>
>> wrote:
>>
>>> Hi Andrew,
>>> As Rohith mentioned, if you are good with it, from the TSv2 side, we are
>>> ready to go for merge tonight itself (Pacific time)  right after the voting
>>> period ends. Varun Saxena has been diligently rebasing up until now so most
>>> likely our merge should be reasonably straightforward.
>>>
>>> @Wangda: your resource profile vote ends tomorrow, could we please
>>> coordinate our merges?
>>>
>>> thanks
>>> Vrushali
>>>
>>>
>>> On Mon, Aug 28, 2017 at 10:45 PM, Rohith Sharma K S <
>>> rohithsharm...@apache.org> wrote:
>>>
>>>> On 29 August 2017 at 06:24, Andrew Wang <andrew.w...@cloudera.com>
>>>> wrote:
>>>>
>>>> > So far I've seen no -1's to the branching proposal, so I plan to
>>>> execute
>>>> > this tomorrow unless there's further feedback.
>>>> >
>>>> For on going branch merge threads i.e TSv2, voting will be closing
>>>> tomorrow. Does it end up in merging into trunk(3.1.0-SNAPSHOT) and
>>>> branch-3.0(3.0.0-beta1-SNAPSHOT) ? If so, would you be able to wait for
>>>> couple of more days before creating branch-3.0 so that TSv2 branch merge
>>>> would be done directly to trunk?
>>>>
>>>>
>>>>
>>>> >
>>>> > Regarding the above discussion, I think Jason and I have essentially
>>>> the
>>>> > same opinion.
>>>> >
>>>> > I hope that keeping trunk a release branch means a higher bar for
>>>> merges
>>>> > and code review in general. In the past, I've seen some patches
>>>> committed
>>>> > to trunk-only as a way of passing responsibility to a future user or
>>>> > reviewer. That doesn't help anyone; patches should be committed with
>>>> the
>>>> > intent of running them in production.
>>>> >
>>>> > I'd also like to repeat the above thanks to the many, many
>>>> contributors
>>>> > who've helped with release improvements. Allen's work on
>>>> create-release and
>>>> > automated changes and release notes were essential, as was Xiao's
>>>> work on
>>>> > LICENSE and NOTICE files. I'm also looking forward to Marton's site
>>>> > improvements, which addresses one of the remaining sore spots in the
>>>> > release process.
>>>> >
>>>> > Things have gotten smoother with each alpha we've done over the last
>>>> year,
>>>> > and it's a testament to everyone's work that we have a good
>>>> probability of
>>>> > shipping beta and GA later this year.
>>>> >
>>>> > Cheers,
>>>> > Andrew
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>


[jira] [Reopened] (HADOOP-14674) Correct javadoc for getRandomizedTempPath

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-14674:
--

I don't see this showing up in trunk, did it actually get committed?

> Correct javadoc for getRandomizedTempPath
> -
>
> Key: HADOOP-14674
> URL: https://issues.apache.org/jira/browse/HADOOP-14674
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HADOOP-14674.001.patch
>
>
> getRandomizedTempPath has incorrect javadoc: the javadoc specifies a
> parameter to the function, but the function doesn't expect one.
> {code}
>   /**
>* Get a temp path. This may or may not be relative; it depends on what the
>* {@link #SYSPROP_TEST_DATA_DIR} is set to. If unset, it returns a path
>* under the relative path {@link #DEFAULT_TEST_DATA_PATH}
>* @param subpath sub path, with no leading "/" character
>* @return a string to use in paths
>*/
>   public static String getRandomizedTempPath() {
> return getTempPath(RandomStringUtils.randomAlphanumeric(10));
>   }
> {code}
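
The fix is presumably just dropping the stale @param tag; a sketch of the
corrected javadoc (not the committed patch):

{code}
  /**
   * Get a temp path. This may or may not be relative; it depends on what the
   * {@link #SYSPROP_TEST_DATA_DIR} is set to. If unset, it returns a path
   * under the relative path {@link #DEFAULT_TEST_DATA_PATH}.
   * @return a string to use in paths
   */
  public static String getRandomizedTempPath() {
    return getTempPath(RandomStringUtils.randomAlphanumeric(10));
  }
{code}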






Re: Branch merges and 3.0.0-beta1 scope

2017-08-29 Thread Andrew Wang
Hi Vinod,

On Fri, Aug 25, 2017 at 2:42 PM, Vinod Kumar Vavilapalli  wrote:

> > From a release management perspective, it's *extremely* reasonable to
> block the inclusion of new features a month from the planned release date.
> A typical software development lifecycle includes weeks of feature freeze
> and weeks of code freeze. It is no knock on any developer or any feature to
> say that we should not include something in 3.0.0.
>
>
> We have never followed the ‘typical' lifecycle that I am guessing you are
> referring to. If we are, you'll need to publish some of the following: a
> feature freeze date, blockers-criticals-only-from-now date,
> testing-finish date, documentation-finish date, final release date and so
> on.
>

We discussed this as part of the 3.0 alpha/beta/GA plan. The point of the
extended alpha/beta process was to release on a schedule. Things that
weren't ready could be merged for the next alpha. I also advertised alpha4
as feature complete and beta1 as code complete so we could quickly move on
to GA.


> What we do with Apache releases typically is instead we say ‘this' is
> roughly when we want to release, and roughly what features must land and
> let the rest figure out itself.
>
> We did this too. We defined the original scope for 3.0.0 GA way back when
we started the 3.0.0 release process. I've been writing status updates on
the wiki and tracking targeted features and release blockers throughout.

The target versions of this recent batch of features were not discussed
with me, the release manager, until just recently. After some discussion, I
think we've arrived at a release plan that everyone's happy with. But, I
want to be clear that late-breaking inclusion of additional scope should be
considered the exception rather than the norm. Merging code so close to
release means less time for testing and validation, which means lower
quality releases.

I don't think it's a lot to ask that feature leads shoot an email to the
release manager of their target version. DISCUSS emails right before a
proposed merge VOTE are way too late, it ends up being a fire drill where
we need to scramble on many fronts.


> Neither is right or wrong. If we want to change the process, we should
> communicate as such.
>
> Proposing a feature freeze date on the fly is only going to confuse
> people.
>

> > I've been very open and clear about the goals, schedule, and scope of
> 3.0.0 over the last year plus. The point of the extended alpha process was
> to get all our features in during alpha, and the alpha merge window has
> been open for a year. I'm unmoved by arguments about how long a feature has
> been worked on. None of these were part of the original 3.0.0 scope,
> and our users have been waiting even longer for big-ticket 3.0 items like
> JDK8 and HDFS EC that were part of the discussed scope.
>
>
> Except our schedule is so fluid (not due to the release management process
> to be fair) that it is hard for folks to plan their features. IIRC, our
> schedule was a GA release beginning of this year. Again, this is not a
> critique of 3.0 release process - I have myself done enough releases to
> know that sticking to a date and herding the crowd has been an extremely
> hard job.
>
>
Schedules have been fluid because we don't know when features are getting
in, and there's an unwillingness to bump features to the next release. The
goal of the 3.x alphas and betas was to break out of this release
anti-pattern, and release on a schedule.

There have been schedule delays during the 3.x alphas, but I'm still proud
that we released 4 alphas in 10 months. I'm doing my best to stick to our
published schedule, and add a beta and GA to that list by EOY.

Best,
Andrew


Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-29 Thread Andrew Wang
Hi Subru,

Basically we're amending the proposal from the original email in the chain
to also immediately create the branch-3.0.0-beta1 release branch. As
described in my 2017-08-25 wiki update, we're gating the merge of these two
features to branch-3.0 on additional testing,  but this keeps 3.0.0 open
for development.

https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

For completeness, here's what our branches and versions would look like:

trunk: 3.1.0-SNAPSHOT
branch-3.0: 3.0.0-SNAPSHOT
branch-3.0.0-beta1: 3.0.0-beta1-SNAPSHOT
branch-2 and etc: remain as is

Best,
Andrew

On Tue, Aug 29, 2017 at 12:21 PM, Subramaniam V K <subru...@gmail.com>
wrote:

> Andrew,
>
> First up thanks for tirelessly pushing on 3.0 release.
>
> I am confused about your comment on creating 2 branches as my
> understanding of Jason's (and Vinod's) comments are that we defer creating
> branch-3?
>
> IMHO, we should consider creating branch-3 (necessary but not sufficient)
> only when we have:
>
>1. a significant incompatible change.
>2. a new feature that cannot be turned off without affecting core
>components.
>
> In summary, I feel we should follow a lazy rather than eager approach
> towards creating mainline branches.
>
> Thanks,
> Subru
>
>
>
> On Tue, Aug 29, 2017 at 11:45 AM, Wangda Tan <wheele...@gmail.com> wrote:
>
>> Gotcha, make sense, so I will hold commit until you cut the two branches
>> and TSv2 get committed.
>>
>> Thanks,
>> Wangda
>>
>> On Tue, Aug 29, 2017 at 11:25 AM, Andrew Wang <andrew.w...@cloudera.com>
>> wrote:
>>
>> > Hi Wangda,
>> >
>> > I'll cut two branches: branch-3.0 (3.0.0-SNAPSHOT) and
>> branch-3.0.0-beta1
>> > (3.0.0-beta1-SNAPSHOT). This way we can merge GA features to branch-3.0
>> but
>> > not branch-3.0.0-beta1.
>> >
>> > Best,
>> > Andrew
>> >
>> > On Tue, Aug 29, 2017 at 11:18 AM, Wangda Tan <wheele...@gmail.com>
>> wrote:
>> >
>> >> Vrushali,
>> >>
>> >> Sure we can wait TSv2 merged before merge resource profile branch.
>> >>
>> >> Andrew,
>> >>
>> >> My understanding is you're going to cut branch-3.0 for 3.0-beta1, and
>> the
>> >> same branch (branch-3.0) will be used for 3.0-GA as well. So my
>> question
>> >> is, there're several features (TSv2, resource profile, YARN-5734) are
>> >> targeted to merge to 3.0-GA but not 3.0-beta1, which branch we should
>> >> commit to, and when we can commit? Also, similar to 3.0.0-alpha1 to 4,
>> you
>> >> will cut branch-3.0.0-beta1, correct?
>> >>
>> >> Thanks,
>> >> Wangda
>> >>
>> >>
>> >> On Tue, Aug 29, 2017 at 11:05 AM, Andrew Wang <
>> andrew.w...@cloudera.com>
>> >> wrote:
>> >>
>> >>> Sure. Ping me when the TSv2 goes in, and I can take care of branching.
>> >>>
>> >>> We're still waiting on the native services and S3Guard merges, but I
>> >>> don't want to hold branching to the last minute.
>> >>>
>> >>> On Tue, Aug 29, 2017 at 10:51 AM, Vrushali C <vrushalic2...@gmail.com
>> >
>> >>> wrote:
>> >>>
>> >>>> Hi Andrew,
>> >>>> As Rohith mentioned, if you are good with it, from the TSv2 side, we
>> >>>> are ready to go for merge tonight itself (Pacific time)  right after
>> the
>> >>>> voting period ends. Varun Saxena has been diligently rebasing up
>> until now
>> >>>> so most likely our merge should be reasonably straightforward.
>> >>>>
>> >>>> @Wangda: your resource profile vote ends tomorrow, could we please
>> >>>> coordinate our merges?
>> >>>>
>> >>>> thanks
>> >>>> Vrushali
>> >>>>
>> >>>>
>> >>>> On Mon, Aug 28, 2017 at 10:45 PM, Rohith Sharma K S <
>> >>>> rohithsharm...@apache.org> wrote:
>> >>>>
>> >>>>> On 29 August 2017 at 06:24, Andrew Wang <andrew.w...@cloudera.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>> > So far I've seen no -1's to the branching proposal, so I plan to
>> >>>>> execute
>> >>>>> > this tomorrow unless there's further feedback.
>> >>>>> >
>> >&

Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-29 Thread Andrew Wang
Hi Wangda,

I'll cut two branches: branch-3.0 (3.0.0-SNAPSHOT) and branch-3.0.0-beta1
(3.0.0-beta1-SNAPSHOT). This way we can merge GA features to branch-3.0 but
not branch-3.0.0-beta1.

Best,
Andrew

On Tue, Aug 29, 2017 at 11:18 AM, Wangda Tan <wheele...@gmail.com> wrote:

> Vrushali,
>
> Sure we can wait TSv2 merged before merge resource profile branch.
>
> Andrew,
>
> My understanding is you're going to cut branch-3.0 for 3.0-beta1, and the
> same branch (branch-3.0) will be used for 3.0-GA as well. So my question
> is, there're several features (TSv2, resource profile, YARN-5734) are
> targeted to merge to 3.0-GA but not 3.0-beta1, which branch we should
> commit to, and when we can commit? Also, similar to 3.0.0-alpha1 to 4, you
> will cut branch-3.0.0-beta1, correct?
>
> Thanks,
> Wangda
>
>
> On Tue, Aug 29, 2017 at 11:05 AM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Sure. Ping me when the TSv2 goes in, and I can take care of branching.
>>
>> We're still waiting on the native services and S3Guard merges, but I
>> don't want to hold branching to the last minute.
>>
>> On Tue, Aug 29, 2017 at 10:51 AM, Vrushali C <vrushalic2...@gmail.com>
>> wrote:
>>
>>> Hi Andrew,
>>> As Rohith mentioned, if you are good with it, from the TSv2 side, we are
>>> ready to go for merge tonight itself (Pacific time)  right after the voting
>>> period ends. Varun Saxena has been diligently rebasing up until now so most
>>> likely our merge should be reasonably straightforward.
>>>
>>> @Wangda: your resource profile vote ends tomorrow, could we please
>>> coordinate our merges?
>>>
>>> thanks
>>> Vrushali
>>>
>>>
>>> On Mon, Aug 28, 2017 at 10:45 PM, Rohith Sharma K S <
>>> rohithsharm...@apache.org> wrote:
>>>
>>>> On 29 August 2017 at 06:24, Andrew Wang <andrew.w...@cloudera.com>
>>>> wrote:
>>>>
>>>> > So far I've seen no -1's to the branching proposal, so I plan to
>>>> execute
>>>> > this tomorrow unless there's further feedback.
>>>> >
>>>> For on going branch merge threads i.e TSv2, voting will be closing
>>>> tomorrow. Does it end up in merging into trunk(3.1.0-SNAPSHOT) and
>>>> branch-3.0(3.0.0-beta1-SNAPSHOT) ? If so, would you be able to wait for
>>>> couple of more days before creating branch-3.0 so that TSv2 branch merge
>>>> would be done directly to trunk?
>>>>
>>>>
>>>>
>>>> >
>>>> > Regarding the above discussion, I think Jason and I have essentially
>>>> the
>>>> > same opinion.
>>>> >
>>>> > I hope that keeping trunk a release branch means a higher bar for
>>>> merges
>>>> > and code review in general. In the past, I've seen some patches
>>>> committed
>>>> > to trunk-only as a way of passing responsibility to a future user or
>>>> > reviewer. That doesn't help anyone; patches should be committed with
>>>> the
>>>> > intent of running them in production.
>>>> >
>>>> > I'd also like to repeat the above thanks to the many, many
>>>> contributors
>>>> > who've helped with release improvements. Allen's work on
>>>> create-release and
>>>> > automated changes and release notes were essential, as was Xiao's
>>>> work on
>>>> > LICENSE and NOTICE files. I'm also looking forward to Marton's site
>>>> > improvements, which addresses one of the remaining sore spots in the
>>>> > release process.
>>>> >
>>>> > Things have gotten smoother with each alpha we've done over the last
>>>> year,
>>>> > and it's a testament to everyone's work that we have a good
>>>> probability of
>>>> > shipping beta and GA later this year.
>>>> >
>>>> > Cheers,
>>>> > Andrew
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>


Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-29 Thread Andrew Wang
Sure. Ping me when the TSv2 goes in, and I can take care of branching.

We're still waiting on the native services and S3Guard merges, but I don't
want to hold branching to the last minute.

On Tue, Aug 29, 2017 at 10:51 AM, Vrushali C <vrushalic2...@gmail.com>
wrote:

> Hi Andrew,
> As Rohith mentioned, if you are good with it, from the TSv2 side, we are
> ready to go for merge tonight itself (Pacific time)  right after the voting
> period ends. Varun Saxena has been diligently rebasing up until now so most
> likely our merge should be reasonably straightforward.
>
> @Wangda: your resource profile vote ends tomorrow, could we please
> coordinate our merges?
>
> thanks
> Vrushali
>
>
> On Mon, Aug 28, 2017 at 10:45 PM, Rohith Sharma K S <
> rohithsharm...@apache.org> wrote:
>
>> On 29 August 2017 at 06:24, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>
>> > So far I've seen no -1's to the branching proposal, so I plan to execute
>> > this tomorrow unless there's further feedback.
>> >
>> For on going branch merge threads i.e TSv2, voting will be closing
>> tomorrow. Does it end up in merging into trunk(3.1.0-SNAPSHOT) and
>> branch-3.0(3.0.0-beta1-SNAPSHOT) ? If so, would you be able to wait for
>> couple of more days before creating branch-3.0 so that TSv2 branch merge
>> would be done directly to trunk?
>>
>>
>>
>> >
>> > Regarding the above discussion, I think Jason and I have essentially the
>> > same opinion.
>> >
>> > I hope that keeping trunk a release branch means a higher bar for merges
>> > and code review in general. In the past, I've seen some patches
>> committed
>> > to trunk-only as a way of passing responsibility to a future user or
>> > reviewer. That doesn't help anyone; patches should be committed with the
>> > intent of running them in production.
>> >
>> > I'd also like to repeat the above thanks to the many, many contributors
>> > who've helped with release improvements. Allen's work on create-release
>> and
>> > automated changes and release notes were essential, as was Xiao's work
>> on
>> > LICENSE and NOTICE files. I'm also looking forward to Marton's site
>> > improvements, which addresses one of the remaining sore spots in the
>> > release process.
>> >
>> > Things have gotten smoother with each alpha we've done over the last
>> year,
>> > and it's a testament to everyone's work that we have a good probability
>> of
>> > shipping beta and GA later this year.
>> >
>> > Cheers,
>> > Andrew
>> >
>> >
>>
>
>


Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-28 Thread Andrew Wang
So far I've seen no -1's to the branching proposal, so I plan to execute
this tomorrow unless there's further feedback.

Regarding the above discussion, I think Jason and I have essentially the
same opinion.

I hope that keeping trunk a release branch means a higher bar for merges
and code review in general. In the past, I've seen some patches committed
to trunk-only as a way of passing responsibility to a future user or
reviewer. That doesn't help anyone; patches should be committed with the
intent of running them in production.

I'd also like to repeat the above thanks to the many, many contributors
who've helped with release improvements. Allen's work on create-release and
automated changes and release notes were essential, as was Xiao's work on
LICENSE and NOTICE files. I'm also looking forward to Marton's site
improvements, which addresses one of the remaining sore spots in the
release process.

Things have gotten smoother with each alpha we've done over the last year,
and it's a testament to everyone's work that we have a good probability of
shipping beta and GA later this year.

Cheers,
Andrew

On Mon, Aug 28, 2017 at 3:48 PM, Colin McCabe  wrote:

> On Mon, Aug 28, 2017, at 14:22, Allen Wittenauer wrote:
> >
> > > On Aug 28, 2017, at 12:41 PM, Jason Lowe  wrote:
> > >
> > > I think this gets back to the "if it's worth committing" part.
> >
> >   This brings us back to my original question:
> >
> >   "Doesn't this place an undue burden on the contributor with the
> first incompatible patch to prove worthiness?  What happens if it is
> decided that it's not good enough?"
>
> I feel like this line of argument is flawed by definition.  "What
> happens if the patch isn't worth breaking compatibility over"?  Then we
> shouldn't break compatibility over it.  We all know that most
> compatibility breaks are avoidable with enough effort.  And it's an
> effort we should make, for the good of our users.
>
> Most useful features can be implemented without compatibility breaks.
> And for the few that truly can't, the community should surely agree that
> it's worth breaking compatibility before we do it.  If it's a really
> cool feature, that approval will surely not be hard to get (I'm tempted
> to quote your earlier email about how much we love features...)
>
> >
> >   The answer, if I understand your position, is then at least a
> maybe leaning towards yes: a patch that prior to this branching policy
> change that  would have gone in without any notice now has a higher burden
> (i.e., major feature) to prove worthiness ... and in the process eliminates
> a whole class of contributors and empowers others. Thus my concern ...
> >
> > > As you mentioned, people are already breaking compatibility left and
> right as it is, which is why I wondered if it was really any better in
> practice.  Personally I'd rather find out about a major breakage sooner
> than later, since if trunk remains an active area of development at all
> times it's more likely the community will sit up and take notice when
> something crazy goes in.  In the past, trunk was not really an actively
> deployed area for over 5 years, and all sorts of stuff went in without
> people really being aware of it.
> >
> >   Given the general acknowledgement that the compatibility
> guidelines are mostly useless in reality, maybe the answer is really that
> we're doing releases all wrong.  Would it necessarily be a bad thing if we
> moved to a model where incompatible changes are gradually released instead of
> one big one every seven years?
>
> I haven't seen anyone "acknowledge that... compatibility guidelines are
> mostly useless"... even you.  Reading your posts from the past, I don't
> get that impression.  On the contrary, you are often upset about
> compatibility breakages.
>
> What would be positive about allowing compatibility breaks in minor
> releases?  Can you give a specific example of what would be improved?
>
> >
> >   Yes, I lived through the "walking on glass" days at Yahoo! and
> realize what I'm saying.  But I also think the rate of incompatible changes
> has slowed tremendously.  Entire groups of APIs aren't getting tossed out
> every week anymore.
> >
> > > It sounds like we agree on that part but disagree on the specifics of
> how to help trunk remain active.
> >
> >   Yup, and there is nothing wrong with that. ;)
> >
> > >  Given that historically trunk has languished for years I was hoping
> this proposal would help reduce the likelihood of it happening again.  If
> we eventually decide that cutting branch-3 now makes more sense then I'll
> do what I can to make that work well, but it would be good to see concrete
> proposals on how to avoid the problems we had with it over the last 6 years.
> >
> >
> >   Yup, agree. But proposals rarely seem to get much actual traction.
> (It's kind of fun reading the Hadoop bylaws and compatibility guidelines
> and old [VOTE] threads to realize 

2017-08-25 Hadoop 3 release status update

2017-08-25 Thread Andrew Wang
Hi all,

I've written up a status report for the current state of Hadoop 3 on the
wiki. I've also pasted it below for your convenience.

Cheers,
Andrew

https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-08-25

Another month flew by without an update. This is a big one.

Red flags:

   - 11 blockers still on the dashboard, with some filed recently. Need to
   burn these down.
   - There are many branch merge proposals flying around for features that
   were not originally being tracked for beta1 and GA. Introducing new code
   always comes with risk, so I'm working with the different contributors
   involved to discuss target versions, confirm readiness, and define quality
   bars for merge.

Miscellaneous blockers:

   - HADOOP-14284  (Shade
   Guava everywhere): We have agreement to shade the YARN client JAR. Shading
   hadoop-hdfs is still being discussed (see the classpath check after this
   list).
   - HADOOP-13363  (Upgrade
   to protobuf 3): Waiting on the Guava shading first.
   - YARN-7076 : New
   blocker; we need an assignee.
   - YARN-7094  (Document
   that server-side graceful decom is currently not recommended): Robert has a
   patch up, needs review. This is a stopgap for the old blocker YARN-5464.
   - YARN-5536  (Multiple
   format support (JSON, etc.) for exclude node file in NM graceful
   decommission with timeout): Robert has a proposal that needs to be pushed
   on.
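
   As an aside on the Guava item above, here is a quick way to see where
   Guava enters a downstream project's classpath (a hedged sketch: the
   command is standard Maven, and the project you run it from, anything
   depending on the Hadoop client artifacts, is up to you):

     # Show every dependency path that pulls in Guava; before shading,
     # Hadoop's client artifacts leak it onto downstream classpaths.
     mvn dependency:tree -Dincludes=com.google.guava:guava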

beta1 features:

   - Erasure coding
  - There are three must-dos. Two have patches, one might not be a
  must-do.
  - I pinged the pluggable policy JIRA to see if metadata and API
  compatibility is complete.
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
  - Sean has HDFS rolling upgrade scripts up, waiting on Ray to add
  some YARN/MR coverage too.
   - Need to do a final runthrough of the JACC reports for YARN and HDFS (a
   sketch of generating one follows this list).
   - Classpath isolation (HADOOP-11656)
   - We're down to the wire on this; I pinged Sean for an update.
   - Compat guide (HADOOP-13714
   )
   - I pinged the JIRA on this too; no updated patch since May.
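
A minimal sketch of generating one of those JACC reports, assuming the
japi-compliance-checker tool; the jar names and versions compared here are
illustrative only:

  # Compare the public API of two hadoop-hdfs builds; writes an HTML report
  # of source- and binary-level incompatibilities under compat_reports/.
  japi-compliance-checker -lib hadoop-hdfs \
    hadoop-hdfs-2.8.0.jar hadoop-hdfs-3.0.0-beta1.jar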

Features under discussion:

I spoke with a number of lead contributors about these features, which were
previously not on my radar.

3.0.0-beta1:

   - YARN native services (Jian He)
  - I was convinced that this is very separate from the core. I'll get
   someone from Cloudera to run it through our integration tests to verify it
   doesn't break anything downstream, then happy to merge.
   - TSv2 alpha 2 (Vrushali C)
   - Despite being called "alpha 2", this is more like "beta" in terms of
  readiness. Twitter is planning to roll it out to production. Seems quite
  done.
  - I double checked with Haibo, and he successfully ran it through our
  internal integration testing.

3.0.0 GA:

   - Resource profiles (Wangda Tan)
  - Alpha feature, APIs are not stable yet. Has some compatible PB
   changes, will verify rolling upgrade from branch-2. Touches some core
   parts of YARN.
   - Decided that it's too close to beta1 for this; we're going to test
  it a lot and make sure it's ready for 3.0.0 GA.
   - HDFS router-based federation (Chris Douglas)
   - This is like YARN federation: very separate, doesn't add new APIs, and
   runs in production at MSFT.
  - If it passes Cloudera internal integration testing, I'm fine
  putting this in for GA.

3.1.0:

   - Storage Policy Satisfier (Uma Gangumalla)
  - We're resolving some design discussions on JIRA. Plan is to do some
  MVP work on the API to get this into 3.1, and if we're happy with the
   second phase, consider it for 3.0 GA.
   - HDFS tiered storage (Chris Douglas):
   - This touches some core stuff, and the write path is still being worked
   on. Still somewhat useful with just the read path. Targeting 3.1.0 gives
   enough time to wrap this up.


Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Andrew Wang
Jonathan, thanks for the heads up. I don't have much familiarity with YARN,
but gave the PBs and pom changes a look, and left a few small comments on
the umbrella JIRA.

This seems like a smaller change than some of the other branch merges we're
discussing, but I'm again reluctant to add scope if we can avoid it.

In your mind, is this truly a "must-have" for 3.0? It looks compatible, and
thus something we could add in a minor release like 2.9 or 3.1.

Best,
Andrew

On Fri, Aug 25, 2017 at 12:31 PM, Jonathan Hung <jyhung2...@gmail.com>
wrote:

> Hi Andrew,
>
> Thanks for starting the discussion - we have a feature YARN-5734 for API
> based scheduler configuration that I feel is pretty close to merge (also "a
> few weeks"). It's almost completely code and API additions and we were
> careful to design it so that it's compatible (feature is also turned off by
> default). Hoping to get this in before 3.0.0-GA. Just wanted to send this
> note so that we are not caught off guard by this feature.
>
> Thanks!
>
>
> Jonathan Hung
>
> On Fri, Aug 25, 2017 at 11:06 AM, Wangda Tan <wheele...@gmail.com> wrote:
>
>> The resource profiles feature is similar to TSv2:
>> - Alpha feature, we will not freeze newly added APIs, and all added APIs are
>> explicitly marked @Unstable.
>> - Allows rolling upgrade from branch-2.
>> - Touches existing code, but we have run, and will continue to run, tests to
>> make sure the changes are safe.
>>
>> Discussed with Andrew offline; we decided not to put this in beta1 since
>> beta1 is not far away. But we want to put it in before GA if sufficient tests
>> are done.
>>
>> Thanks,
>> Wangda
>>
>>
>>
>> On Fri, Aug 25, 2017 at 10:54 AM, Rohith Sharma K S <
>> rohithsharm...@apache.org> wrote:
>>
>> > On 25 August 2017 at 22:39, Andrew Wang <andrew.w...@cloudera.com>
>> wrote:
>> >
>> > > Hi Rohith,
>> > >
>> > > Given that we're advertising TSv2 as an alpha feature, I think we're
>> > > allowed to break compatibility. Let's make sure this is clear in the
>> > > release notes and documentation.
>> > >
>> >
>> > > That said, with TSv2 phase 2, is the API going to be frozen? The
>> umbrella
>> > > JIRA refers to "TSv2 alpha2" which indicated to me it was still
>> > alpha-level
>> > > quality and stability.
>> > >
>> > Yes, we have decided to freeze the APIs. I do not think we will make any
>> > compatibility breaks in the future.
>> >
>> >
>> >
>> > >
>> > > Best,
>> > > Andrew
>> > >
>> >
>>
>
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Andrew Wang
Here's a summary of some 1-on-1 conversations I had with contributors of
the different features I'm tracking.

Storage Policy Satisfier (Uma Gangumalla)
* Target version: 3.1.0, maybe 3.0.0 GA
* We're resolving some design discussions on JIRA. Plan is to do some MVP
work on the API to get this into 3.1, and if we're happy with the second
phase, consider it for 3.0 GA.

YARN native services (Jian He)
* Target version: 3.0.0-beta1 as an alpha feature
* I was convinced that this is very separate from the core. I'll get
someone from Cloudera to run it through our integration tests to verify it
doesn't break anything downstream, then happy to merge.

Resource profiles (Wangda Tan)
* Target version: 3.0.0 GA
* Already provided update above, we're going to test it a lot and target
for GA.

HDFS router-based federation (Chris Douglas)
* Target version: 3.0.0 GA
* This is like YARN federation: very separate, doesn't add new APIs, and runs
in production.
* If it passes our internal integration testing, I'm fine putting this in
late.

HDFS tiered storage (Chris Douglas):
* Target version: 3.1.0
* This touches some core stuff, and the write path is still being worked
on. Still somewhat useful with just the read path. Targeting 3.1.0 gives
enough time to wrap this up.

TSv2 phase 2 (Vrushali C)
* Target version: 3.0.0-beta1
* This is more like "beta" in terms of readiness, Twitter is planning to
roll it out to production.
* I double checked with Haibo, and he successfully ran it through our
internal integration testing.

Thanks to everyone for meeting with me on short notice, and being very
reasonable about target versions and quality bars. If I mischaracterized
any of our discussions, please reach out or comment.

The branching and versioning discussion is still proceeding. I'd ask those
running the pending merge VOTEs to watch it carefully; I'm hoping to resolve the
discussion and branch before the VOTEs close, but let's make sure the
branches and versions are ready before doing the actual merges.

Thanks,
Andrew

On Fri, Aug 25, 2017 at 11:06 AM, Wangda Tan <wheele...@gmail.com> wrote:

> The resource profiles feature is similar to TSv2:
> - Alpha feature, we will not freeze newly added APIs, and all added APIs are
> explicitly marked @Unstable.
> - Allows rolling upgrade from branch-2.
> - Touches existing code, but we have run, and will continue to run, tests to
> make sure the changes are safe.
>
> Discussed with Andrew offline; we decided not to put this in beta1 since
> beta1 is not far away. But we want to put it in before GA if sufficient tests
> are done.
>
> Thanks,
> Wangda
>
>
>
> On Fri, Aug 25, 2017 at 10:54 AM, Rohith Sharma K S <
> rohithsharm...@apache.org> wrote:
>
>> On 25 August 2017 at 22:39, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>
>> > Hi Rohith,
>> >
>> > Given that we're advertising TSv2 as an alpha feature, I think we're
>> > allowed to break compatibility. Let's make sure this is clear in the
>> > release notes and documentation.
>> >
>>
>> > That said, with TSv2 phase 2, is the API going to be frozen? The
>> umbrella
>> > JIRA refers to "TSv2 alpha2" which indicated to me it was still
>> alpha-level
>> > quality and stability.
>> >
>> Yes, we have decided to freeze the APIs. I do not think we will make any
>> compatibility breaks in the future.
>>
>>
>>
>> >
>> > Best,
>> > Andrew
>> >
>>
>
>


[DISCUSS] Branches and versions for Hadoop 3

2017-08-25 Thread Andrew Wang
Hi folks,

With 3.0.0-beta1 fast approaching, I wanted to go over the proposed
branching strategy.

In the early 2.x days, moving trunk immediately to 3.0.0 was a mistake.
branch-2 and trunk were virtually identical, which increased backport
complexity. Until we need to make incompatible changes, there's no need for
a Hadoop 4.0 version.

Thus, here's a proposal of branches and versions:

trunk: 3.1.0-SNAPSHOT
branch-3.0: 3.0.0-beta1-SNAPSHOT
branch-2, etc.: remain as is
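
For illustration, here's a minimal sketch of how that cut might look, assuming
plain git plus the versions-maven-plugin (mvn versions:set); Hadoop's actual
release tooling may differ:

  # Cut the release branch from trunk and set its Maven version.
  git checkout trunk
  git checkout -b branch-3.0
  mvn versions:set -DnewVersion=3.0.0-beta1-SNAPSHOT -DgenerateBackupPoms=false

  # Move trunk ahead to the next minor line.
  git checkout trunk
  mvn versions:set -DnewVersion=3.1.0-SNAPSHOT -DgenerateBackupPoms=false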

LMK questions/comments/etc. Appreciate your attentiveness; I'm hoping to
build consensus quickly since we have a number of open VOTEs for branch
merges.

Thanks,
Andrew


Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Andrew Wang
Hi Jason,

I agree with this proposal. I'll start another email thread spelling this
out, and gather additional feedback.

Best,
Andrew

On Fri, Aug 25, 2017 at 6:27 AM, Jason Lowe <jl...@oath.com> wrote:

> Andrew Wang wrote:
>
>
>> This means I'll cut branch-3 and
>> branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
>> up development for Hadoop 3.1.0 and 4.0.0.
>
>
> I can see a need for branch-3.0, but please do not create branch-3.  Doing
> so will relegate trunk back to the "patch purgatory" branch, a place where
> patches won't see a release for years.  Unless something is imminently
> going in that will break backwards compatibility and warrant a new 4.x
> release, I don't see the need to distinguish trunk from the 3.x line.
> Leaving trunk as the 3.x line means fewer branches to commit patches through
> and more testing of every patch since trunk would remain an active area for
> testing and releasing.  If we separate trunk and branch-3 then it's almost
> certain only-trunk patches will start to accumulate and never get any
> "real" testing until someone eventually decides it's time to go to Hadoop
> 4.x.  Looking back at trunk-as-3.x for an example, patches committed there
> in the early days after branch-2 was cut didn't see a release for almost 6
> years.
>
> My apologies if I've missed a feature that is just going to miss the 3.0
> release and will break compatibility when it goes in.  If so then we need
> to cut branch-3, but if not then here's my plea to hold off until we do
> need it.
>
> Jason
>
>
> On Thu, Aug 24, 2017 at 3:33 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
>> Glad to see the discussion continued in my absence :)
>>
>> From a release management perspective, it's *extremely* reasonable to
>> block
>> the inclusion of new features a month from the planned release date. A
>> typical software development lifecycle includes weeks of feature freeze
>> and
>> weeks of code freeze. It is no knock on any developer or any feature to
>> say
>> that we should not include something in 3.0.0.
>>
>> I've been very open and clear about the goals, schedule, and scope of
>> 3.0.0
>> over the last year plus. The point of the extended alpha process was to
>> get
>> all our features in during alpha, and the alpha merge window has been open
>> for a year. I'm unmoved by arguments about how long a feature has been
>> worked on. None of these were not part of the original 3.0.0 scope, and
>> our
>> users have been waiting even longer for big-ticket 3.0 items like JDK8 and
>> HDFS EC that were part of the discussed scope.
>>
>> I see that two VOTEs have gone out since I was out. I still plan to follow
>> the proposal in my original email. This means I'll cut branch-3 and
>> branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
>> up development for Hadoop 3.1.0 and 4.0.0.
>>
>> I'm reaching out to the lead contributor of each of these features
>> individually to discuss. We need to close on this quickly, and email is
>> too
>> low bandwidth at this stage.
>>
>> Best,
>> Andrew
>>
>
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Andrew Wang
Hi Rohith,

Given that we're advertising TSv2 as an alpha feature, I think we're
allowed to break compatibility. Let's make sure this is clear in the
release notes and documentation.

That said, with TSv2 phase 2, is the API going to be frozen? The umbrella
JIRA refers to "TSv2 alpha2" which indicated to me it was still alpha-level
quality and stability.

Best,
Andrew

On Thu, Aug 24, 2017 at 11:47 PM, Rohith Sharma K S <
rohithsharm...@apache.org> wrote:

> Hi Andrew
>
> Thanks for the update on the release plan!
>
> I would like to discuss compatibility of releases specifically.
> What compatibility has to be maintained for GA if we don't merge into the
> beta1 release? IIUC, till now all the releases were alpha, where
> compatibility was not that important; all the public interfaces were
> subject to modification. Once we release beta, compatibility will matter.
> During this gap, i.e. between the beta and GA releases, should we maintain
> compatibility?
> If my understanding is right, then TSv2 has to be merged for the beta1
> release. In TSv2 phase-2, we have compatibility changes from phase-1.
>
>
> Thanks & Regards
> Rohith Sharma K S
>
> On 25 August 2017 at 02:03, Andrew Wang <andrew.w...@cloudera.com> wrote:
>
> > Glad to see the discussion continued in my absence :)
> >
> > From a release management perspective, it's *extremely* reasonable to
> block
> > the inclusion of new features a month from the planned release date. A
> > typical software development lifecycle includes weeks of feature freeze
> and
> > weeks of code freeze. It is no knock on any developer or any feature to
> say
> > that we should not include something in 3.0.0.
> >
> > I've been very open and clear about the goals, schedule, and scope of
> 3.0.0
> > over the last year plus. The point of the extended alpha process was to
> get
> > all our features in during alpha, and the alpha merge window has been
> open
> > for a year. I'm unmoved by arguments about how long a feature has been
> > worked on. None of these were not part of the original 3.0.0 scope, and
> our
> > users have been waiting even longer for big-ticket 3.0 items like JDK8
> and
> > HDFS EC that were part of the discussed scope.
> >
> > I see that two VOTEs have gone out since I was out. I still plan to
> follow
> > the proposal in my original email. This means I'll cut branch-3 and
> > branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will
> open
> > up development for Hadoop 3.1.0 and 4.0.0.
> >
> > I'm reaching out to the lead contributor of each of these features
> > individually to discuss. We need to close on this quickly, and email is
> too
> > low bandwidth at this stage.
> >
> > Best,
> > Andrew
> >
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-24 Thread Andrew Wang
Glad to see the discussion continued in my absence :)

From a release management perspective, it's *extremely* reasonable to block
the inclusion of new features a month from the planned release date. A
typical software development lifecycle includes weeks of feature freeze and
weeks of code freeze. It is no knock on any developer or any feature to say
that we should not include something in 3.0.0.

I've been very open and clear about the goals, schedule, and scope of 3.0.0
over the last year plus. The point of the extended alpha process was to get
all our features in during alpha, and the alpha merge window has been open
for a year. I'm unmoved by arguments about how long a feature has been
worked on. None of these were part of the original 3.0.0 scope, and our
users have been waiting even longer for big-ticket 3.0 items like JDK8 and
HDFS EC that were part of the discussed scope.

I see that two VOTEs have gone out since I was out. I still plan to follow
the proposal in my original email. This means I'll cut branch-3 and
branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
up development for Hadoop 3.1.0 and 4.0.0.

I'm reaching out to the lead contributor of each of these features
individually to discuss. We need to close on this quickly, and email is too
low bandwidth at this stage.

Best,
Andrew

