Re: About supporting upcoming java versions

2018-11-07 Thread J. Rottinghuis
Not to throw fuel on the fire, so to say, nor to make any statement about
not wanting anybody to spend time on JDK 9 or 10, but our general thinking
at Twitter is that we'll skip over these versions and move straight
to JDK 11 as well.

That said, this is still a bit of an aspiration for us rather than
something we're working on right away in the Hadoop team (there is some
other tech debt to iron out first before we get to that).

Cheers,

Joep

On Wed, Nov 7, 2018 at 2:18 AM Steve Loughran 
wrote:

>
> If there are problems w/ JDK11 then we should be talking to Oracle about
> them to have them fixed. Is there an ASF JIRA on this issue yet?
>
> As usual, the large physical clusters will be slow to upgrade,
> but the smaller cloud ones can get away with being agile, and since YARN
> lets you run code with a different path to the JVM, people can mix things.
> This makes it possible for people to run Java 11+ apps even if Hadoop
> itself is on Java 8.
>
> And this time we may want to think about which release we declare "ready
> for Java 11", being proactive rather than lagging behind the public
> releases by many years (6=>7, 7=>8). Of course, we'll have to stay with the
> Java 8 language for a while, but there's a lot more we can do there in our
> code. I'm currently (HADOOP-14556) embracing Optional, as it makes explicit
> when things are potentially null, and while it's crippled by the Java
> language itself (
> http://steveloughran.blogspot.com/2018/10/javas-use-of-checked-exceptions.html
> ), it's still something we can embrace (*)
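>
> To make that concrete, a rough sketch in that style (this is not code from
> HADOOP-14556; the helper name and paths are made up for illustration):
>
>   // needs java.io.IOException, java.util.Optional and
>   // org.apache.hadoop.fs.{FileStatus, FileSystem, Path}
>   Optional<Path> firstMatch(FileSystem fs, Path dir, String suffix)
>       throws IOException {
>     for (FileStatus st : fs.listStatus(dir)) {
>       if (st.isFile() && st.getPath().getName().endsWith(suffix)) {
>         return Optional.of(st.getPath());
>       }
>     }
>     return Optional.empty();
>   }
>
>   // The caller has to decide what "absent" means instead of risking an NPE:
>   firstMatch(fs, new Path("/logs"), ".gz")
>       .ifPresent(p -> System.out.println("found " + p));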
>
>
> Takanobu,
>
> I've been watching the work you, Akira and others have been putting in for
> Java 9+ support and it's wonderful. If we had an annual award for
> "persevering in the presence of extreme suffering", it'd be the top
> candidate for this year's work.
>
> It means we are lined up to let people run Hadoop on Java 11 if they want,
> and gives us the option of moving to Java 11 sooner rather than later. I'm
> also looking at JUnit 5, wondering when I can embrace it fully (i.e. not
> worry about cherry-picking code into JUnit 4 tests).
>
> Thanks for all your work
>
> -Steve
>
> (*) I also have in the test code of that branch a binding of UGI.doAs which
> takes closures:
>
>
> https://github.com/steveloughran/hadoop/blob/s3/HADOOP-14556-delegation-token/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/LambdaTestUtils.java#L865
>
>
> lets me do things like
>
> assertEquals("FS username in doAs()",
>     ALICE,
>     doAs(bobUser, () -> fs.getUsername()));
>
> If someone wants to actually pull this support into UGI itself, happy to
> review, as moving our doAs code to things like bobUser.doAs(() ->
> fs.create(path)) will transform all those UGI code users.
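>
> For contrast, a sketch of the anonymous-class pattern that such a
> closure-friendly doAs would replace (the user name and body are
> illustrative; UserGroupInformation.doAs and PrivilegedExceptionAction are
> the existing APIs):
>
>   UserGroupInformation bobUser = UserGroupInformation.createRemoteUser("bob");
>   String user = bobUser.doAs(new PrivilegedExceptionAction<String>() {
>     @Override
>     public String run() throws Exception {
>       // everything in run() executes as "bob"; every call site today
>       // carries this ceremony plus the checked-exception handling
>       return UserGroupInformation.getCurrentUser().getShortUserName();
>     }
>   });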
>
> On 6 Nov 2018, at 05:57, Takanobu Asanuma <tasan...@apache.org> wrote:
>
> Thanks for your reply, Owen.
>
> That said, I’d be surprised if the work items for JDK 9 and 10 aren’t a
> strict subset of the issues getting to JDK 11.
>
> Most of the issues that we have fixed are a subset of the ones for JDK 11.
> But there seem to be some exceptions. HADOOP-15905 is a bug in JDK 9/10
> which has been fixed in JDK 11. It is difficult to fix since JDK 9/10 are
> already EOL. I wonder how we should treat that kind of error going
> forward.
>
> I've hit at least one pretty serious JVM bug in JDK 11
> Could you please share the details?
>
> In any case, we should be careful about which version of Hadoop we declare
> ready for JDK 11. It will take some time yet. And we also need to keep
> supporting JDK 8 for a while.
>
> Regards,
> - Takanobu
>
>
>
>


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-06 Thread J. Rottinghuis
Sorry for jumping in late into the fray of this discussion.

It seems Ozone is a large feature. I appreciate the development effort and
the desire to get this into the hands of users.
I understand the need to iterate quickly and to reduce overhead for
development.
I also agree that Hadoop can benefit from a quicker release cycle. For our
part, this is a challenge as we have a large installation with multiple
clusters and thousands of users. It is a constant balance between jumping
to the newest release and the cost of integration and testing at our
scale, especially when things aren't backwards compatible. We try to be
good citizens and upstream our changes and contribute back.

The point was made that splitting out the projects such as common and YARN
didn't work and had to be reverted. That was painful and a lot of work for
those involved, for sure. This project may be slightly different in that
hadoop-common, YARN and HDFS made for one consistent whole; one couldn't
run one project without the others.

Having a separate block management layer, with possibly multiple pluggable
block implementations under the covers, would be a good future
development for HDFS. Some users would choose Ozone as that layer, some
might use S3, others GCS, or Azure, or something else.
If the argument is made that nobody will be able to run Hadoop as a
consistent stack without Ozone, then that would be a strong case to keep
things in the same repo.

Obviously when people do want to use Ozone, having it in the same repo
is easier. The flip side is that, separate top-level project in the same
repo or not, it adds to the Hadoop releases. If there is a change in Ozone
and a new release is needed, it would have to wait for a Hadoop release. Ditto
if there is a Hadoop release and there is an issue with Ozone. The argument
that one could turn off Ozone through a Maven profile works only to some
extent.
If we have done a 3.x release with Ozone in it, would it make sense to do a
3.y release with y>x without Ozone in it? That would be weird.

This does sound like a Hadoop 4 feature. Compatibility with lots of new
features in Hadoop 3 needs to be worked out. We're still working on jumping
to a Hadoop 2.9 release and then on getting a stepping-stone release to
3.0 to bridge compatibility issues. I'm afraid that adding a very large new
feature into trunk now essentially makes going to Hadoop 3 not viable for
quite a while. That would be a bummer for all the feature work that has
gone into Hadoop 3. Encryption and erasure coding are very appealing
features, especially in light of meeting GDPR requirements.

I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
in, and keep the rest in a separate project. Iterate quickly in that
separate project; you can have a separate set of committers, and you can do a
separate release cycle. If that develops Ozone into _the_ new block layer
for all use cases (either because people are willing to give up on encryption
and erasure coding, or because feature parity is reached), then we can jump
off that bridge when we reach it. I think adding a very large chunk of code
that relatively few people in the community are familiar with isn't
necessarily going to help Hadoop at this time.

Cheers,

Joep

On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey 
wrote:

> Hi Andrew,
>
>  I think we can eliminate the maintenance costs even in the same repo. We
> can make the following changes that incorporate suggestions from Daryn and Owen
> as well.
> 1. Hadoop-hdsl-project will be at the root of the hadoop repo, in a separate
> directory.
> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
> 3. Based on Daryn’s suggestion, hdsl can optionally (via config) be
> loaded in the DN as a pluggable module.
>  If not loaded, there will be absolutely no code path through hdsl or
> ozone.
> 4. To further make it easier for folks building hadoop, we can support a
> maven profile for hdsl/ozone. If the profile is deactivated, hdsl/ozone will
> not be built.
>  For example, Cloudera can choose to skip even compiling/building
> hdsl/ozone and therefore incur no maintenance overhead whatsoever.
>  HADOOP-14453 has a patch that shows how it can be done.
>
> Arguably, there are two kinds of maintenance costs. Costs for developers
> and the cost for users.
> - Developers: A maven profile as noted in points (3) and (4) above
> completely addresses the concern for developers,
>  as there are no compile-time dependencies
> and further, they can choose not to build ozone/hdsl.
> - User: Cost to users will be completely alleviated if ozone/hdsl is not
> loaded as mentioned in point (3) above.
>
> jitendra
>
> From: Andrew Wang 
> Date: Monday, March 5, 2018 at 3:54 PM
> To: Wangda Tan 
> Cc: Owen O'Malley , Daryn Sharp
> , Jitendra Pandey ,
> hdfs-dev 

Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-27 Thread J. Rottinghuis
That makes sense to me. After the merge to trunk and branch-2, it probably
makes sense to create a new JIRA and feature branch for additional ATSv2
feature development.

Cheers,

Joep

On Fri, Aug 25, 2017 at 3:05 PM, Vrushali C  wrote:

> Hi Subru,
>
> Thanks for your vote and your response!
>
> Regarding your question about merging to branch-2, I think, after the merge
> to trunk is done, it will be good to merge to branch-2 as soon as we can.
> A lot of testing with trunk has already been done and it will be good to
> repeat the same with the YARN-5355_branch2 branch before we start working
> on newer features. That way trunk and branch-2 would be in a similar state
> with respect to timeline service v2 development.
>
> thanks
> Vrushali
>
>
>
>
>
>
>
> On Fri, Aug 25, 2017 at 2:28 PM, varunsax...@apache.org <
> varun.saxena.apa...@gmail.com> wrote:
>
> > Thanks Subru for voting.
> >
> >
> >> What are the timelines you are looking for getting this into branch-2?
> >
> > We haven't yet decided on it and were thinking of discussing this in
> > detail within the team after merge to trunk.
> > The timelines would depend on whether we release whatever we merge to
> > trunk in 2.9, or whether we want to get in a few other features which
> > people would like to see in 2.9.
> > This would require some discussion with the stakeholders.
> > We were thinking of having a short discussion with you guys as well to
> > find out whether there are any further gaps in ATSv2 with respect to
> > federation support and if they can be filled before 2.9 release.
> >
> > Assuming 2.9 is targeted for the end of October, we would have to start a
> > merge vote at the end of September or the first week of October, which
> > leaves us with very little time to take up large changes anyway.
> >
> > We do maintain a branch-2 version of ATSv2 (YARN-5355_branch2), though,
> > which we rebase with branch-2 regularly. So, if we decide to merge into
> > branch-2 without any additional changes, we would be able to go for the
> > branch-2 merge discussion and vote almost immediately.
> >
> > Regards,
> > Varun Saxena.
> >
> >
> >
> >
> > On Sat, Aug 26, 2017 at 2:00 AM, Subramaniam V K 
> > wrote:
> >
> >> +1 (binding).
> >>
> >> I have been following the effort and had a few design discussions with
> >> the team, especially about how it integrates with Federation. Overall I
> >> feel it's a welcome improvement to YARN.
> >>
> >> What are the timelines you are looking for getting this into branch-2?
> >>
> >> Thanks,
> >> Subru
> >>
> >> On Fri, Aug 25, 2017 at 10:04 AM, Sangjin Lee  wrote:
> >>
> >> > +1 (binding)
> >> >
> >> > I've built the current branch, and checked out a few basic areas
> >> including
> >> > documentation. Also perused the most recent changes that went in.
> >> >
> >> > Thanks much for the great team work! I look forward to seeing it in
> >> action.
> >> >
> >> > Regards,
> >> > Sangjin
> >> >
> >> > On Fri, Aug 25, 2017 at 9:27 AM, Haibo Chen 
> >> > wrote:
> >> >
> >> > > +1 from my side.
> >> > >
> >> > > More from the perspective of ensuring there is no impact of ATSv2
> >> > > when it is off (by default), I deployed the latest YARN-5355 bits
> >> > > into a few clusters and ran internal smoke tests. The tests show no
> >> > > impact when ATSv2 is off.
> >> > >
> >> > > Best,
> >> > > Haibo
> >> > >
> >> > > On Thu, Aug 24, 2017 at 7:51 AM, Sunil G  wrote:
> >> > >
> >> > > > Thank you very much Vrushali, Rohith, Varun and other folks who made
> >> > > > this happen. Great work, really appreciate it!!
> >> > > >
> >> > > > +1 (binding) from my side:
> >> > > >
> >> > > > # Tested ATSv2 in a secure cluster. Ran some basic jobs.
> >> > > > # Accessed the new YARN UI, which shows various flows/flow activity
> >> > > > etc. Seems fine.
> >> > > > # Based on the code, it looks like all APIs are compatible.
> >> > > > # REST API docs look fine as well; I guess we could improve them a
> >> > > > bit more post-merge.
> >> > > > # Adding to the additional thoughts discussed here, native services
> >> > > > could also publish events to ATSv2. I think that work has also
> >> > > > happened in the branch.
> >> > > >
> >> > > > Looking forward to a much wider adoption of ATSv2 with more
> >> projects.
> >> > > >
> >> > > > Thanks
> >> > > > Sunil
> >> > > >
> >> > > >
> >> > > > On Tue, Aug 22, 2017 at 12:02 PM Vrushali Channapattan <
> >> > > > vrushalic2...@gmail.com> wrote:
> >> > > >
> >> > > > > Hi folks,
> >> > > > >
> >> > > > > Per earlier discussion [1], I'd like to start a formal vote to
> >> merge
> >> > > > > feature branch YARN-5355 [2] (Timeline Service v.2) to trunk.
> The
> >> > vote
> >> > > > will
> >> > > > > run for 7 days, and will end August 29 11:00 PM PDT.
> >> > > > >
> >> > > > > We have previously completed one merge onto trunk [3] and
> Timeline
> >> > > > Service
> 

Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-23 Thread J. Rottinghuis
+1 (non-binding) for the merge

@Vinod I hope that means a +1 from you as well!

Cheers,

Joep

On Tue, Aug 22, 2017 at 11:15 AM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> Such a great community effort - hats off, team!
>
> Thanks
> +Vinod
>
> > On Aug 21, 2017, at 11:32 PM, Vrushali Channapattan <
> vrushalic2...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Per earlier discussion [1], I'd like to start a formal vote to merge
> > feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote
> will
> > run for 7 days, and will end August 29 11:00 PM PDT.
> >
> > We have previously completed one merge onto trunk [3] and Timeline
> Service
> > v2 has been part of Hadoop release 3.0.0-alpha1.
> >
> > Since then, we have been working on extending the capabilities of
> Timeline
> > Service v2 in a feature branch [2] for a while, and we are reasonably
> > confident that the state of the feature meets the criteria to be merged
> > onto trunk and we'd love folks to get their hands on it in a test
> capacity
> > and provide valuable feedback so that we can make it production-ready.
> >
> > In a nutshell, Timeline Service v.2 delivers significant scalability and
> > usability improvements based on a new architecture. What we would like to
> > merge to trunk is termed "alpha 2" (milestone 2). The feature has a
> > complete end-to-end read/write flow with security and read level
> > authorization via whitelists. You should be able to start setting it up
> and
> > testing it.
> >
> > At a high level, the following are the key features that have been
> > implemented since alpha1:
> > - Security via Kerberos Authentication and delegation tokens
> > - Read side simple authorization via whitelist
> > - Client configurable entity sort ordering
> > - Richer REST APIs for apps, app attempts, containers, fetching metrics
> by
> > timerange, pagination, sub-app entities
> > - Support for storing sub-application entities (entities that exist
> outside
> > the scope of an application)
> > - Configurable TTLs (time-to-live) for tables, configurable table
> prefixes,
> > configurable hbase cluster
> > - Flow level aggregations done as dynamic (table level) coprocessors
> > - Uses latest stable HBase release 1.2.6
> >
> > There are a total of 82 subtasks that were completed as part of this
> effort.
> >
> > We paid close attention to ensure that Timeline Service v.2
> > does not impact existing functionality when disabled (the default).
> >
> > Special thanks to a team of folks who worked hard and contributed towards
> > this effort with patches, reviews and guidance: Rohith Sharma K S, Varun
> > Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep
> > Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Michael Stack.
> >
> > Regards,
> > Vrushali
> >
> > [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> > [2] https://issues.apache.org/jira/browse/YARN-5355
> > [3] https://issues.apache.org/jira/browse/YARN-2928
> > [4] https://github.com/apache/hadoop/commits/YARN-5355
>
>
> -
> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
>
>


Re: [DISCUSS] Looking to a 2.9.0 release

2017-07-26 Thread J. Rottinghuis
Thanks Vrushali for being entirely open as to the current status of ATSv2.
I appreciate that we want to ensure things are tested at scale, and as you
said we are working on that right now on our clusters.
We have tested the feature to demonstrate it works at what we consider
moderate scale.

I think the criterion for including this feature in the 2.9 release should
be whether it can be safely turned off and not cause impact to anybody not
using the new feature. The confidence for this is high for timeline service v2.

Therefore, I think timeline service v2 should definitely be part of 2.9.
That is the big draw for us to work on stabilizing a 2.9 release rather
than just going to 2.8 and back-porting things ourselves.

Thanks,

Joep

On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
vrushalic2...@gmail.com> wrote:

> Thanks Subru for initiating this discussion.
>
> Wanted to share some thoughts in the context of Timeline Service v2. The
> current status of this module is that we are ramping up for a second merge
> to trunk. We still have a few merge blocker jiras outstanding, which we
> think we will finish soon.
>
> While we have done some testing, we are yet to test at scale. Given all
> this, we were thinking of initially targeting a beta release vehicle rather
> than a stable release.
>
> As such, timeline service v2 has a branch-2 branch called
> YARN-5355-branch-2 in case anyone wants to try it out. Timeline service v2
> can be turned off and should not affect the cluster.
>
> thanks
> Vrushali
>
>
>
>
>
> On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan  wrote:
>
> > Folks,
> >
> > With the release of 2.8 out, we would like to look ahead to a 2.9 release,
> > as there are many features/improvements in branch-2 (about 1062 commits)
> > that are in need of a release vehicle.
> >
> > Here's our first cut of the proposal from the YARN side:
> >
> >1. Scheduler improvements (decoupling allocation from node heartbeat,
> >allocation ID, concurrency fixes, LightResource etc).
> >2. Timeline Service v2
> >3. Opportunistic containers
> >4. Federation
> >
> > We would like to hear a formal list from HDFS & Hadoop (& MapReduce if
> any)
> > and will update the Roadmap wiki accordingly.
> >
> > Considering our familiarity with the above mentioned YARN features, we
> > would like to volunteer as the co-RMs for 2.9.0.
> >
> > We want to keep the timeline at 8-12 weeks to keep the release pragmatic.
> >
> > Feedback?
> >
> > -Subru/Arun
> >
>


Re: Feedback on IRC channel

2016-07-14 Thread J. Rottinghuis
Uhm, there is an IRC channel?!?

Joep

On Wed, Jul 13, 2016 at 3:13 PM, Sangjin Lee  wrote:

> I seldom check out IRC (as my experience was the same). I'm OK with
> retiring it if no committers are around.
>
> On a related note, I know Tsuyoshi set up a slack channel for the
> committers. Even that one is pretty idle. :) Should we use it more often?
> If that starts to gain traction, we could set up a more open room for users
> as well.
>
> Sangjin
>
> On Wed, Jul 13, 2016 at 9:13 AM, Karthik Kambatla 
> wrote:
>
> > Recently, Andrew Wang and I were at an academic conference where one of
> the
> > attendees (a grad student) was mentioning that his posts to the IRC
> channel
> > are never answered.
> >
> > Personally, I haven't been using the IRC channel. Neither do I know
> anyone
> > who is actively monitoring it.
> >
> > I am emailing to check:
> >
> >1. Are there folks actively monitoring the IRC channel and answering
> >questions?
> >2. If there is no one, should we consider retiring the channel?
> >
> > Thanks
> > Karthik
> >
>


release notes publishing gap

2015-10-22 Thread J. Rottinghuis
There seems to be a gap in our release notes publishing.

Up to
http://hadoop.apache.org/docs/r2.4.1/
The release notes were updated for each release.

Then these three are the same:
https://hadoop.apache.org/docs/r2.5.2/
http://hadoop.apache.org/docs/r2.6.0/index.html
http://hadoop.apache.org/docs/r2.6.1/index.html

and this one is actually updated again:
http://hadoop.apache.org/docs/r2.7.1/

Is this due to a missed step in the release process?

Cheers,

Joep


Re: Planning Hadoop 2.6.1 release

2015-07-22 Thread J. Rottinghuis
Hi Vinod,

We've gone through the various lists of upstream jiras and wanted to
provide our take on what we'd like to see in the 2.6.1 release.
In addition, we have ~58 jiras (some being groups of work, such as DN
maintenance state) that we already have in production
in a pre-2.6 release. I've gone through and selected those that seem
relevant for 2.6.1.

Jiras already marked with 2.6.1-candidate (we agree these should be part of
2.6.1):
HADOOP-11802
HDFS-7489
HDFS-7533
HDFS-7587
HDFS-7596
HDFS-7707
HDFS-7742
HDFS-8046
HDFS-8072
HDFS-8127
MAPREDUCE-6303
MAPREDUCE-6361
YARN-2414
YARN-2905
YARN-3369
YARN-3485
YARN-3585
YARN-3641
HDFS-7443
HDFS-7575


Jiras not yet marked with 2.6.1-candidate that we'd like to see in 2.6.1:
HDFS-7213
HDFS-7446
HDFS-7704
HDFS-7788
HDFS-7884
HDFS-7894
HDFS-7916
HDFS-7929
HDFS-7930
HDFS-7980
HDFS-8245
HDFS-8270
HDFS-8404
HDFS-8480
HDFS-8486
MAPREDUCE-6238
YARN-2856
MAPREDUCE-6300
YARN-2952
YARN-2997
YARN-3094
YARN-3222
YARN-3238
YARN-3464
YARN-3526
YARN-3850

Jiras that we're already running in production pre-2.6 that we'd like in
2.6.1-candidate:

HADOOP-11812 (committed in 2.8)
MAPREDUCE-5649 (committed in 2.8)
YARN-3231 (committed in 2.7)
MAPREDUCE-6166 (committed in 2.7)
HADOOP-11295 (committed in 2.7)
HDFS-7314 (committed in 2.8)
HDFS-7182 (committed in 2.7)
MAPREDUCE-5465 (committed in 2.8)

Lower-priority jiras that we're already running in production pre-2.6
that are more of a nice-to-have:
HDFS-7281 (committed in 3.0)
YARN-3176 (not yet committed in OSS)

Note that these lists are a culmination of lots of work by many people on
our team, so credit goes to all of them.
Any possible typo or mistake in copying jira ids is entirely to be blamed
on me.

Thanks,

Joep


On Tue, Jul 21, 2015 at 2:15 AM, Akira AJISAKA ajisa...@oss.nttdata.co.jp
wrote:

 Thanks Vinod for updating the candidate list.
 I'd like to include the following 12 JIRAs:

 * YARN-3641
 * YARN-3585
 * YARN-2910
 * HDFS-8431
 * HDFS-7830
 * HDFS-7763
 * HDFS-7742
 * HDFS-7235
 * HDFS-7225
 * MAPREDUCE-6324
 * HADOOP-11934
 * HADOOP-11491

 Thanks,
 Akira

 On 7/18/15 11:13, Vinod Kumar Vavilapalli wrote:

   - I also have a bunch of patches that I’d like to include, will update
 them right away.

 I’ve just finished this. The latest 2.6.1-candidate list is up at 64
 JIRAs.

 Others, please look at the list and post anything else you’d like to get
 included for 2.6.1.

 Thanks
 +Vinod


 On Jul 15, 2015, at 6:24 PM, Vinod Kumar Vavilapalli
 vino...@hortonworks.com wrote:

 Alright, I’d like to make progress while the issue is hot.

 I created a label to discuss on the candidate list of patches:
 https://issues.apache.org/jira/issues/?jql=labels%20%3D%202.6.1-candidate

 Next steps, I’ll do the following
   - Review 2.7 and 2.8 blocker/critical tickets and see what makes sense
 for 2.6.1 and add as candidates
   - I haven’t reviewed the current list yet, the seed list is from this
 email thread. Will review them.
   - I also have a bunch of patches that I’d like to include, will update
 them right away.

 Others, please look at the current list and let me know what else you’d
 like to include.

 I’d like to keep this ‘candidate-collection’ cycle for a max of a week
 and then start the release process. @Akira, let’s sync up offline on how to
 take this forward in terms of the release process.

 Thanks
 +Vinod






Re: Hadoop - Major releases

2015-03-16 Thread J. Rottinghuis
Here are some of our thoughts on the discussions of the past few days with
respect to backwards compatibility.

In general, at Twitter we're not necessarily against backwards-incompatible
changes per se.

It depends on the Return on Pain. While it is hard to quantify the returns
in the abstract, I can try to sketch out which kinds of changes are the most
painful and therefore cause the most friction for us. In rough order of
increasing pain to deal with:

a) There is a new upstream (3.x) release, but it is so backwards incompatible
that we won't be able to adopt it for the foreseeable future. Even though we
don't adopt it, it still causes pain. Development becomes that much harder
because we'd have to get a patch for trunk, a patch for 3.x and a patch for
the 2.x branch. Conversely, if patches go into 2.x only, the releases start
drifting apart. We already have (several dozen) patches in production that
have not yet made it upstream, but we are striving to keep this list as short
as possible to reduce the rebase pain and risk.

b) Central daemons (RM, or pairs of HA NNs) have to be restarted, causing a
cluster-wide outage. The work towards work-preserving restart in progress in
various areas makes these kinds of upgrades less painful.

c) The server side requires a different runtime from the client side. We'd
have to produce multiple artifacts, but we could make that work. For example,
NN code uses Java 8 features, but clients can still use Java 7 to submit jobs
and read/write HDFS.

Now for the more painful backwards incompatibilities:

d) All clients have to recompile (a token uses protobuf instead of thrift, an
interface becomes an abstract class or vice versa). Not only do these kinds
of changes make a rolling upgrade impossible, more importantly they require
all our clients to recompile their code and redeploy their production
pipelines in a coordinated fashion. On top of this, we have multiple large
production clusters, and clients would have to keep multiple incompatible
pipelines running, because we simply cannot upgrade all clusters in all
datacenters at the same time.

e) Customers are forced to restart and can no longer run with JDK 7 clients
because job submission client code or HDFS has started using JDK 8-only
features. Eventually this group will shrink, but for at least another year
if not more this will be very painful.

f) Even more painful is when YARN/MapReduce APIs change so that customers not
only have to recompile, but also have to change hundreds of scripts / flows
in order to deal with the API change. This problem is compounded by other
tools in the Hadoop ecosystem that would have to deal with these changes.
There would be two different versions of Cascading, HBase, Hive, Pig, Spark,
Tez, you name it.

g) Without proper classpath isolation, third-party dependency changes (guava,
protobuf version, etc.) are probably as painful as API changes.

h) The HDFS client API gets changed in a backwards-incompatible way,
requiring all clients to change their code, recompile and restart their
services in a coordinated way. We have tens of thousands of production
servers reading from / writing to Hadoop and cannot have all of these
long-running clients restart at the same time.

To put these in perspective: despite us being one of the early adopters of
Hadoop 2 in production at the scale of many thousands of nodes, we are still
wrapping up the migration from our last Hadoop 1 clusters. We have many war
stories about many of the above incompatibilities. As I've tweeted about
publicly, the gains have been significant with this migration to Hadoop 2,
but the friction has also been considerable.

To get specific about JDK 8: we are intending to move to Java 8. Right now
we're letting clients optionally choose to run tasks with JDK 8; then we'll
make it the default. After that we'll switch to running the daemons with
JDK 8. At that point it would become feasible to use JDK 8 features on the
server side (see c) above).

I'm suggesting that if we do allow backwards-incompatible changes, we
introduce an upgrade path through an agreed-upon stepping-stone release. For
example, a protocol changing from thrift to protobuf can be done in steps: in
the stepping-stone release both would be accepted; in the following release
(or two releases later) the thrift version support is dropped. This would
allow for a rolling upgrade, or, even if a cluster-wide restart is needed, at
least customers can adapt to the change at a pace of weeks or months. Once no
more (important) customers are running the thrift client, we could then roll
to the next release. It would be useful to coordinate the backwards
incompatibilities so that not every release becomes a stepping-stone release.

Cheers,

Joep




On Mon, Mar 9, 2015 at 6:04 PM, Andrew Wang andrew.w...@cloudera.com
wrote:

 Hi Mayank,


  1. We would be moving to Hadoop -3 (Not this year though) however I don't
  see we can do another JDK upgrade so soon. So the point I am trying to
 make
  is we should be 

Re: symlink support in Hadoop 2 GA

2013-09-18 Thread J. Rottinghuis
However painful protobuf version changes are at build time for Hadoop
developers, at runtime with multiple clusters and many Hadoop users this is
a total nightmare.
Even upgrading clusters from one protobuf version to the next is going to
be very difficult. The same users will run jobs on, and/or read/write to,
multiple clusters. Does that mean they will have to fork their code and run
multiple instances? Or at the very least they have to update their
applications, all in sync with Hadoop cluster changes. And these are not
doable in a rolling fashion.
Will all Hadoop and HBase clusters upgrade at the same time, or will we
have to have our users fork / roll multiple versions?
My point is that these things are much harder than "just fix the (Jenkins)
build and we're done". These changes are massively disruptive.

There is a similar situation with symlinks. Having an API that lets users
create symlinks is very problematic. Some users create symlinks, and as Eli
pointed out, somebody else (or an automated process) tries to copy to / from
another (Hadoop 1.x?) cluster over hftp. What will happen?
Having an API that people should not use is also a nightmare. We
experienced this with append. For a while it was there, but users were not
allowed to use it (or else there were large #'s of corrupt blocks). If
there is an API to create a symlink, then some of our users are going to
use it and others are going to trip over those symlinks. We already know
that Pig does not work with symlinks yet, and as Steve pointed out, there
is tons of other code out there that assumes that !isDir() means isFile().
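
To make that assumption concrete, here is a rough sketch (illustrative only;
isDirectory(), isFile() and isSymlink() are the FileStatus methods, isDir()
being the older name for the first):

  // Pre-symlink code commonly assumed only two cases:
  for (FileStatus st : fs.listStatus(dir)) {
    if (st.isDirectory()) {
      // recurse into the directory
    } else {
      // assumed to be a regular file; once symlinks (possibly dangling)
      // can show up in a listing, links silently land in this branch
    }
  }

  // Symlink-aware code needs a third case:
  for (FileStatus st : fs.listStatus(dir)) {
    if (st.isSymlink()) {
      // resolve, skip, or fail deliberately
    } else if (st.isDirectory()) {
      // recurse into the directory
    } else {
      // a plain file
    }
  }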

I like symlink functionality, but in our migration to Hadoop 2.x this is a
total distraction. If the APIs stay in 2.2 GA we'll have to choose to:
a) not uprev until symlink support is figured out up and down the stack,
and we've been able to migrate all our 1.x (equivalent) clusters to 2.x
(equivalent); or
b) rip out the API altogether; or
c) change the implementation to throw an UnsupportedOperationException.
I'm not sure yet which of these I like least.

Thanks,

Joep




On Wed, Sep 18, 2013 at 9:48 AM, Arun C Murthy a...@hortonworks.com wrote:


 On Sep 16, 2013, at 6:49 PM, Andrew Wang andrew.w...@cloudera.com wrote:

  Hi all,
 
  I wanted to broadcast plans for putting the FileSystem symlinks work
  (HADOOP-8040) into branch-2.1 for the pending Hadoop 2 GA release. I
 think
  it's pretty important we get it in since it's not a compatible change; if
  it misses the GA train, we're not going to have symlinks until the next
  major release.

 Just catching up, is this an incompatible change, or not? The above reads
 'not an incompatible change'.

 Arun

 
  However, we're still dealing with ongoing issues revealed via testing.
  There's user-code out there that only handles files and directories and
  will barf when given a symlink (perhaps a dangling one!). See HADOOP-9912
  for a nice example where globStatus returning symlinks broke Pig; some of
  us had a conference call to talk it through, and one definite conclusion
  was that this wasn't solvable in a generally compatible manner.
 
  There are also still some gaps in symlink support right now. For example,
  the more esoteric FileSystems like WebHDFS, HttpFS, and HFTP need symlink
  resolution, and tooling like the FsShell and Distcp still need to be
  updated as well.
 
  So, there's definitely work to be done, but there are a lot of users
  interested in the feature, and symlinks really should be in GA. Would
  appreciate any thoughts/input on the matter.
 
  Thanks,
  Andrew

 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/






Re: [VOTE] Release Apache Hadoop 2.0.5-alpha

2013-05-31 Thread J. Rottinghuis
Thanks for the hadoop-2.0.5-alpha RC0 Cos!
+1 (non-binding) in principle for a 2.0.5-alpha release.

Similar to Alejandro I see that:

/hadoop-2.0.5-alpha-src/hadoop-hdfs-project/README.txt (has 2.0.4 release
date missing):
Release 2.0.5-alpha - UNRELEASED
Release 2.0.4-alpha - UNRELEASED

/hadoop-yarn-project/CHANGES.txt (is missing the 2.0.5 section and does not
have the date for the previous 2.0.4 release set):
Release 2.0.5-alpha (- Missing section)
Release 2.0.4-alpha - UNRELEASED

In addition I do not see the release notes at the top of the tar.gz:
releasenotes.2.0.5-alpha.html
releasenotes.HADOOP.2.0.5-alpha.html
releasenotes.HDFS.2.0.5-alpha.html
releasenotes.MAPREDUCE.2.0.5-alpha.html
releasenotes.YARN.2.0.5-alpha.html

I do see the src-tar.gz etc., but not the regular tarball in
http://people.apache.org/~cos/hadoop-2.0.5-alpha-rc0/

Not sure if that warrants a RC1 or not.

Thanks,

Joep

On Fri, May 31, 2013 at 5:27 PM, Alejandro Abdelnur t...@cloudera.com wrote:

 Verified MD5  signature, built, configured pseudo cluster, run a couple of
 sample jobs, tested HTTPFS.

 Still, something seems odd.

 The HDFS CHANGES.txt has the following entry under 2.0.5-alpha:

  HDFS-4646. createNNProxyWithClientProtocol ignores configured timeout
 value  (Jagane Sundar via cos)

 but I don't see that in the branch.

 And, the YARN CHANGES.txt does not have the 2.0.5-alpha section (it should
 be there empty).

 Cos, can you please look at these 2 things and explain/fix?

 Thanks.



 On Fri, May 31, 2013 at 4:04 PM, Konstantin Boudnik c...@apache.org
 wrote:

  All,
 
  I have created a release candidate (rc0) for hadoop-2.0.5-alpha that I
  would
  like to release.
 
  This is a stabilization release that includes fixes for a couple of issues
  discovered in testing with the BigTop 0.6.0 release candidate.
 
  The RC is available at:
  http://people.apache.org/~cos/hadoop-2.0.5-alpha-rc0/
  The RC tag in svn is here:
 
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.5-alpha-rc0
 
  The maven artifacts will be available via repository.apache.org on Sat,
  June
  1st, 2013 at 2 pm PDT as outlined here
  http://s.apache.org/WKD
 
  Please try the release bits and vote; the vote will run for 3 days,
  because this is just a version name change. The bits are identical to the
  ones
  voted on before in
  http://s.apache.org/2041move
 
  Thanks for your voting
Cos
 
 


 --
 Alejandro



Re: [VOTE] Release Apache Hadoop 2.0.5-alpha

2013-05-31 Thread J. Rottinghuis
Thanks for fixing Cos.
http://people.apache.org/~cos/hadoop-2.0.5-alpha-rc1/
looks good to me.
+1 (non-binding)

Thanks,

Joep


On Fri, May 31, 2013 at 8:25 PM, Konstantin Boudnik c...@apache.org wrote:

 Ok, WRT HDFS-4646 - it is all legit and the code is in branch-2.0.4-alpha
 and
 later. It has been committed as
   r1465124
 The reason it isn't normally visible is because of the weird commit message:

 svn merge -c1465121 from trunk

 So, we're good. I am done with the CHANGES.txt fixes that you guys have noted
 earlier and will be re-spinning RC1 in a few.

 Cos

 On Fri, May 31, 2013 at 08:07PM, Konstantin Boudnik wrote:
  Alejandro,
 
  thanks for looking into this. Indeed - I missed the 2.0.5-alpha section
 in
  YARN CHANGES.txt. Added now. As for HDFS-4646: apparently I didn't get it
  into branch-2.0.4-alpha back then, although I distinctly remember doing
  this.
  Let me pull it into 2.0.5-alpha and update CHANGES.txt to reflect it.
 Also, I
  will do JIRA in a moment.
 
   Joep, appreciate the thorough examination. I have fixed the dates for the
   2.0.4-alpha release. As for the top-level readme files - sorry, I wasn't
   aware of them. As for the binary: I am pretty sure we are only releasing
   source code, but I will put binaries into the rc1 respin.
 
  I will respin rc1 shortly. Appreciate the feedback!
Cos
 
  On Fri, May 31, 2013 at 05:27PM, Alejandro Abdelnur wrote:
   Verified MD5  signature, built, configured pseudo cluster, run a
 couple of
   sample jobs, tested HTTPFS.
  
   Still, something seems odd.
  
   The HDFS CHANGES.txt has the following entry under 2.0.5-alpha:
  
HDFS-4646. createNNProxyWithClientProtocol ignores configured timeout
   value  (Jagane Sundar via cos)
  
   but I don't see that in the branch.
  
   And, the YARN CHANGES.txt does not have the 2.0.5-alpha section (it
 should
   be there empty).
  
   Cos, can you please look at these 2 things and explain/fix?
  
   Thanks.
  
  
  
   On Fri, May 31, 2013 at 4:04 PM, Konstantin Boudnik c...@apache.org
 wrote:
  
All,
   
I have created a release candidate (rc0) for hadoop-2.0.5-alpha that
 I
would
like to release.
   
 This is a stabilization release that includes fixes for a couple of issues
 discovered in testing with the BigTop 0.6.0 release candidate.
   
The RC is available at:
http://people.apache.org/~cos/hadoop-2.0.5-alpha-rc0/
The RC tag in svn is here:
   
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.5-alpha-rc0
   
The maven artifacts will be available via repository.apache.org on
 Sat,
June
1st, 2013 at 2 pm PDT as outlined here
http://s.apache.org/WKD
   
Please try the release bits and vote; the vote will run for the 3
 days,
because this is just a version name change. The bits are identical
 to the
ones
voted on before in
http://s.apache.org/2041move
   
Thanks for your voting
  Cos
   
   
  
  
   --
   Alejandro





Re: [VOTE] Release Apache Hadoop 2.0.5-alpha (rc1)

2013-05-31 Thread J. Rottinghuis
+1 (non-binding)

Joep


On Fri, May 31, 2013 at 9:27 PM, Konstantin Boudnik c...@apache.org wrote:

 All,

 I have created a release candidate (rc1) for hadoop-2.0.5-alpha that I
 would
 like to release.

  This is a stabilization release that includes fixes for a couple of issues
  discovered in testing with the BigTop 0.6.0 release candidate.

 The RC is available at:
 http://people.apache.org/~cos/hadoop-2.0.5-alpha-rc1/
 The RC tag in svn is here:
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.5-alpha-rc1

 The maven artifacts will be available via repository.apache.org on Sat,
 June
 1st, 2013 at 2 pm PDT as outlined here
 http://s.apache.org/WKD

  Please try the release bits and vote; the vote will run for 3 days,
 because this is just a version name change. The bits are identical to the
 ones
 voted on before in
 http://s.apache.org/2041move

 Thanks for your voting
   Cos