+1 for Steve's and Chris's sentiments. Mass reformatting of existing code can
make maintaining anything released prior to the makeover very difficult.
Almost all of Apache Hadoop's users are not on trunk or branch-2, and I'm not
sure we want large refactoring patches going into stability lines like
branch-2.8, branch-2.7, and branch-2.6 where most of the users are. We should
definitely consider the maintenance costs of refactoring decisions.
Jason
On Wednesday, August 9, 2017 4:55 AM, Steve Loughran
<[email protected]> wrote:
> On 8 Aug 2017, at 21:33, Chris Douglas <[email protected]> wrote:
>
> Lars-
>
> Welcome!
>
> As a mild refinement of enthusiasm for this proposal: when you
> approach a "cleanup", please consider the cost to tracing the lineage
> of changes in the codebase. Working on a project as large and
> long-running as Hadoop, we often need to trace what motivated a
> particular change using only the commit log and JIRA. Sifting through
> cosmetic changes that obscure the reasoning behind a module not worth
> the aesthetic benefits of consistently formatted code. As a strawman:
> hitting 100% checkstyle compliance would not improve our users'
> experience, so please use your judgement.
>
> As you point out, we're not going to maintain perfect discipline going
> forward, either. Nitpicking our contributors beyond what is necessary
> to keep the code legible discourages them from continuing as
> contributors. As a general heuristic: the stricter the rule, the more
> automation is required to enforce it. This prevents everyone from
> burning out on minutiae.
>
> All that said, if you propose a refactoring that makes it easier to
> maintain code that's developed more vestigial parts that functional
> ones (and we have more than a few of those), that is hugely valuable.
> -C
That reminds me of a few more issues
* Major cleanup patches invariably break handling pending patches from others
(which we should review) and also make cherrypicking harder. Which we why we
tend to avoid things like a "lets fix the import order" patch for the sake of
it.
* We can't treat things tagged @Private as stuff we can break on a whim. I know
it'd be nice, but often they get picked up because they're the only way to do
things...even the example YARN app does this. So changes there always need to
go through a scan of the downstream apps. A few of us have IntelliJ set up to
include all the main projects so we can find if/where a class or method gets
used...and use that to temper our enthusiasm. Chris himself had to deal with
this last week with the proposed removal of FileStatus.isDir in HADOOP-14726 .
We never want to break downstream code
FWIW, I'd use any new styleguide to manage future contribs, not reapply to all
existing code, except during other work. Even then, with caution
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]