See comments inline.

Regards,
Eric

From: <a...@cloudera.com> on behalf of "Aaron T. Myers" <a...@apache.org>
Date: Wednesday, January 10, 2018 at 9:21 AM
To: Eric Yang <ey...@hortonworks.com>
Cc: Chris Douglas <cdoug...@apache.org>, larry mccay <lmc...@apache.org>, 
Hadoop Common <common-dev@hadoop.apache.org>
Subject: Re: When are incompatible changes acceptable (HDFS-12990)

Hey Eric,

Comments inline.

On Wed, Jan 10, 2018 at 9:06 AM, Eric Yang <ey...@hortonworks.com> wrote:
Hi Aaron,

Correct me if I am wrong, but the port change is only required when creating a new 
cluster, because it is the default value.  An existing cluster does not need to make 
the switch if the Hadoop configuration contains a user-defined port number.
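
(For illustration: a minimal core-site.xml sketch of the kind of user-defined 
setting described above, with a placeholder hostname.  A deployment that pins the 
port like this is unaffected by a change to the built-in default.)

  <property>
    <name>fs.defaultFS</name>
    <!-- nn.example.com is a placeholder; 8020 is the pre-3.0.0 default NN RPC port -->
    <value>hdfs://nn.example.com:8020</value>
  </property>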

Certainly true that a port change isn't required, and if it's already properly 
being set everywhere throughout a deployment (i.e. all clients, client 
applications, scripts, etc.) it won't be an issue. I'm most worried about 
*client* configs throughout a large deployment, which would be difficult 
(impossible?) to coordinate an update to. Entirely possible, if not likely, 
that many clients are inadvertently relying on the default port, so when they 
start using the updated software they'll break because of the default port 
change.

Ambari and Cloudera Manager already handle user-defined ports correctly.  Some QA 
tools may need to change, but it is a good exercise to run on a non-standard port.

Sites which are using Ambari or Cloudera Manager are more likely to work, but 
again, I worry about client configs and other places that might have hard-coded 
the port number, e.g. in Hive or in scripts.

I will also say that Hadoop users who are *not* using Ambari or CM should be 
considered as well. Sites like this are perhaps the most likely to break 
because of this change.

Agreed.
I gave my vote to keep the setting, and I fully respect the community’s decision 
in this matter.

Thanks, Eric. I understand your argument to be that changing this default port 
might not be so bad, but it also sounds like you wouldn't object if others 
conclude that it's best to change it back. Is that right?

The decision is in the hands of the Apache Hadoop community.  It is not a decision 
that can be made by one individual or one company.  Let’s start a voting thread to 
make sure the decision is made properly by the Hadoop community.

Best,
Aaron



Regards,
Eric

From: <a...@cloudera.com> on behalf of "Aaron T. Myers" <a...@apache.org>
Date: Tuesday, January 9, 2018 at 9:22 PM
To: Eric Yang <ey...@hortonworks.com>
Cc: Chris Douglas <cdoug...@apache.org>, larry mccay <lmc...@apache.org>, 
Hadoop Common <common-dev@hadoop.apache.org>
Subject: Re: When are incompatible changes acceptable (HDFS-12990)

On Tue, Jan 9, 2018 at 3:15 PM, Eric Yang <ey...@hortonworks.com> wrote:
While I agree the original port change was unnecessary, I don’t think the Hadoop NN 
port change is a bad thing.

I worked for a Hadoop distro whose NN RPC port defaulted to port 9000.  When we 
migrated from BigInsights to IOP, and now to HDP, we had to move customer Hive 
metadata to the new NN RPC port.  It only took one developer (myself) to write the 
migration tool.  The resulting workload is not as bad as most people anticipate, 
because Hadoop depends on the configuration file for referencing the namenode.  
Most of the code works transparently.  It also helped harden the downstream testing 
tools to be more robust.

While there are of course ways to deal with this, the question really should be 
whether or not it's a desirable thing to do to our users.


We will never know how many people are actively working on Hadoop 3.0.0.  
Perhaps a couple hundred developers, perhaps thousands.

You're right that we can't know for sure, but I strongly suspect that this is a 
substantial overestimate. Given how conservative Hadoop operators tend to be, I 
view it as exceptionally unlikely that many deployments have been created on or 
upgraded to Hadoop 3.0.0 since it was released less than a month ago.

Further, I hope you'll agree that the number of 
users/developers/deployments/applications which are currently on Hadoop 2.x is 
*vastly* greater than anyone who might have jumped on Hadoop 3.0.0 so quickly. 
When all of those users upgrade to any 3.x version, they will encounter this 
needless incompatible change and be forced to work around it.

I think the switch back may save a few developers some work, but more people could 
be impacted by an unexpected change in a minor release in the future.  I recommend 
keeping the current values to avoid rule bending and future frustration.

That we allow this incompatible change now does not mean that we are 
categorically allowing more incompatible changes in the future. My point is 
that we should in all instances evaluate the merit of any incompatible change 
on a case-by-case basis. This is not an exceptional circumstance - we've made 
incompatible changes in the past when appropriate, e.g. breaking some clients 
to address a security issue. I and others believe that in this case the 
benefits greatly outweigh the downsides of changing this back to what it has 
always been.

Best,
Aaron


Regards,
Eric

On 1/9/18, 11:21 AM, "Chris Douglas" <cdoug...@apache.org> wrote:

    Particularly since 9820 isn't in the contiguous range of ports in
    HDFS-9427, is there any value in this change?

    Let's change it back to prevent the disruption to users, but
    downstream projects should treat this as a bug in their tests. Please
    open JIRAs in affected projects. -C


    On Tue, Jan 9, 2018 at 5:18 AM, larry mccay <lmc...@apache.org> wrote:
    > On Mon, Jan 8, 2018 at 11:28 PM, Aaron T. Myers <a...@apache.org> wrote:
    >
    >> Thanks a lot for the response, Larry. Comments inline.
    >>
    >> On Mon, Jan 8, 2018 at 6:44 PM, larry mccay <lmc...@apache.org> wrote:
    >>
    >>> Question...
    >>>
    >>> Can this be addressed in some way during or before upgrade that allows it
    >>> to only affect new installs?
    >>> Even a config-based workaround prior to upgrade might make this change
    >>> less disruptive.
    >>>
    >>> If part of the upgrade process includes a step (maybe even a script) to
    >>> set the NN RPC port explicitly beforehand, then existing deployments and
    >>> related clients would remain whole - otherwise they will pick up the new
    >>> default port.
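
    (For illustration: a minimal hdfs-site.xml sketch of the kind of explicit 
    pre-upgrade setting suggested here, with a placeholder hostname and assuming a 
    simple non-HA setup.  Pinning dfs.namenode.rpc-address, together with the 
    matching fs.defaultFS in core-site.xml, freezes whatever port a deployment 
    already uses, independent of the built-in default.)

      <property>
        <name>dfs.namenode.rpc-address</name>
        <!-- nn.example.com is a placeholder; keep the port the cluster uses today -->
        <value>nn.example.com:8020</value>
      </property>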
    >>>
    >>
    >> Perhaps something like this could be done, but I think there are downsides
    >> to anything like this. For example, I'm sure there are plenty of
    >> applications written on top of Hadoop that have tests which hard-code the
    >> port number. Nothing we do in a setup script will help here. If we don't
    >> change the default port back to what it was, these tests will likely all
    >> have to be updated.
    >>
    >>
    >
    > I may not have made my point clear enough.
    > What I meant to say is to fix the default port but direct folks to
    > explicitly set the port they are using in a deployment (the current
    > default) so that it doesn't change out from under them - unless they are
    > fine with it changing.
    >
    >
    >>
    >>> Meta note: we shouldn't be so pedantic about policy that we can't back
    >>> out something that is considered a bug or even a mistake.
    >>>
    >>
    >> This is my bigger point. Rigidly adhering to the compat guidelines in this
    >> instance helps almost no one, while hurting many folks.
    >>
    >> We basically made a mistake when we decided to change the default NN port
    >> with little upside, even between major versions. We discovered this very
    >> quickly, and we have an opportunity to fix it now and in so doing likely
    >> disrupt very, very few users and downstream applications. If we don't
    >> change it, we'll be causing difficulty for our users, downstream
    >> developers, and ourselves, potentially for years.
    >>
    >
    > Agreed.
    >
    >
    >>
    >> Best,
    >> Aaron
    >>



