Thanks John, another one for my list of things to look at.

On Mon, Aug 7, 2017 at 11:24 PM, John Zhuge <john.zh...@gmail.com> wrote:

> And check out HADOOP-12145
> <https://issues.apache.org/jira/browse/HADOOP-12145> Organize and update
> CodeReviewChecklist wiki.
>
> Thanks, your contribution will be greatly appreciated!
>
>
> On Mon, Aug 7, 2017 at 5:53 AM, Steve Loughran <ste...@hortonworks.com>
> wrote:
>
>>
>> Hi Lars & Welcome!
>>
>> Maybe the first step here would be to look at those style guides and think
>> about how to bring them up to date, especially with stuff like lambda
>> expressions in Java 8, modules forthcoming in Java 9, SLF4J logging,
>> JUnit 4 -> 5 testing, code instrumentation, diagnostics, log stability, etc.
>>
>> https://issues.apache.org/jira/browse/HADOOP-12143
>>
>> This is my go at doing this
>>
>> https://github.com/steveloughran/formality/blob/master/styleguide/styleguide.md
>>
>>
>> I've not done any work on trying to get it in, more evolving it as how I
>> code & what I look for, especially in tests.
>>
>> If you want to take this on, it'd be nice. At the same time, I fear
>> there'd be push back if you turned up and started telling people what to
>> do. Collaborating with us all on the test code is a good place to start.
>>
>> We're also more relaxed about contributions to the less-core bits of the
>> system (things like HDFS, IPC, security and Yarn core are trouble). If
>> there's stuff outside that you want to take a go at helping clean up,
>> that'd be lower risk (example: object store connectors)
>>
>> -Steve
>>
>>
>>
>> On 7 Aug 2017, at 13:13, Lars Francke <lars.fran...@gmail.com> wrote:
>>
>> Hi,
>>
>> a few words about me: I've contributed to Hadoop (and its ecosystem[4]) in
>> the past, am a Hive committer, and have used Hadoop for 10 years now, so I'm
>> not totally inexperienced. I earn my money as a Hadoop consultant, so
>> I've seen dozens of real-life clusters.
>>
>> As part of a few recent client projects, and now while writing about Hadoop
>> for a new project/book, I'm digging into the source code to figure out some
>> of the things that are not documented.
>>
>> But as part of this digging I'm seeing lots of warnings in the code,
>> inconsistencies etc. and I'd like to contribute some fixes to this back to
>> the community.
>>
>> I have been a long-time believer in good code quality and consistent code
>> styles. This especially affects people like me who make a lot of
>> "drive-by" contributions: I'm not someone who looks at the code daily, but
>> I come across it reasonably often as part of client engagements. In those
>> scenarios, it's very unhelpful to have inconsistent code & bad
>> documentation.
>>
>> Two simple but concrete examples:
>> * There's lots of "final" usages on variables and methods but no
>> consistency. Was this done for particular reasons or personal preference?
>>
>> personal, though with a move to lambda expressions, it matters a lot more. We
>> should really be marking all parameters as final at the very least.
>>
>>
>> * Similarly, there are lots of things that are public or protected while
>> they could in theory be private. This especially makes it very hard to
>> reason about code.
>>
>> there's now a bit of fear of breaking things, but at the very least,
>> things could be protected or package-private more than they are.
>>
>>
>>
>> Judging from the current code there's lots of "unofficial" code styling
>> and/or personal preference. The Wiki says[1] to follow the Sun
>> guidelines[2], which have not been updated in almost 20 years. A new
>> version is in the works and clarifies a lot of things[3]. I'm trying to get
>> it published soon. I'd try to format according to the latter (that means,
>> among other things, no "final" for local variables).
>>
>> I realize that I won't be able to single-handedly fix all of this
>> especially as code gets contributed but if the community thinks it's
>> worthwhile I'd still love to land a few cleanup patches. My experience in
>> the past has been that it's hard to get attention to these things (which I
>> fully understand as they take up someone's time to review & commit).
>>
>> So, this is my request for comments on these questions:
>> * Is there any interest in this at all?
>> ** "This" being patches for code style & things like FindBugs & Checkstyle
>> warnings
>> * Size of the patches: Rather one big patch or smaller ones (e.g. per file
>> or package)
>> * Anyone willing to help me with this? e.g. reviewing and committing? I'd
>> be more than happy to bribe you with drinks, sweets, food or something else
>>
>> My plan is not to go through each and every file and fix every issue I
>> see. But there are some specific areas I'm looking at in detail and there
>> I'd love to contribute back.
>>
>> Thank you for reading!
>>
>> Cheers,
>> Lars
>>
>> PS: Posting to common-dev only, not sure if I should cross post to
>> hdfs-dev and yarn-dev as well?
>>
>> [1] <https://wiki.apache.org/hadoop/CodeReviewChecklist>
>> [2] <http://www.oracle.com/technetwork/java/javase/documentation/codeconvtoc-136057.html>
>>
>> [3] <http://cr.openjdk.java.net/~alundblad/styleguide/index-v6.html>
>> [4] <https://issues.apache.org/jira/issues/?filter=-1&jql=reporter%20%3D%20lars_francke%20OR%20assignee%20%3D%20lars_francke>
>>
>>
>>
>
>
> --
> John
>
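
For readers following the "final" discussion quoted above, here is a minimal sketch of why the keyword interacts with lambda expressions. It is not taken from the Hadoop code base: the class name FinalCaptureSketch and the method greeters are hypothetical, chosen only for illustration. A Java 8 lambda may capture a local variable or parameter only if it is final or effectively final, so declaring parameters final states that constraint up front instead of leaving it to a compile error later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration class; not part of Hadoop.
public class FinalCaptureSketch {

  // Builds one Supplier per name. Both parameters are declared final,
  // so they cannot be reassigned inside the method body.
  static List<Supplier<String>> greeters(final List<String> names,
                                         final String prefix) {
    final List<Supplier<String>> result = new ArrayList<>();
    for (final String name : names) {
      // The lambda captures 'prefix' and 'name'. This compiles only
      // because both are final (effectively final would also work).
      // Reassigning either one anywhere in this method would turn the
      // capture into a compile-time error.
      result.add(() -> prefix + name);
    }
    return result;
  }

  public static void main(final String[] args) {
    for (final Supplier<String> s
        : greeters(Arrays.asList("hdfs", "yarn"), "hadoop-")) {
      System.out.println(s.get());
    }
  }
}
```

Whether locals such as 'result' and 'name' should also carry the keyword is exactly the style question debated in the thread; the sketch marks them only to show both forms side by side.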
