This is an automated email from the ASF dual-hosted git repository.
vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 8fe9d94 Travis CI build asf-site
8fe9d94 is described below
commit 8fe9d947796418a3e0d2614d733e698b76432ded
Author: CI <[email protected]>
AuthorDate: Thu Sep 3 20:05:58 2020 +0000
Travis CI build asf-site
---
content/contributing.html | 97 ++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 92 insertions(+), 5 deletions(-)
diff --git a/content/contributing.html b/content/contributing.html
index 15ecc845..73fa4bc 100644
--- a/content/contributing.html
+++ b/content/contributing.html
@@ -201,6 +201,7 @@
<li><a href="#filing-jiras">Filing JIRAs</a></li>
<li><a href="#claiming-jiras">Claiming JIRAs</a></li>
<li><a href="#contributing-code">Contributing Code</a></li>
+ <li><a href="#coding-guidelines">Coding guidelines</a></li>
<li><a href="#reviewing-coderfcs">Reviewing Code/RFCs</a></li>
<li><a href="#suggest-changes">Suggest Changes</a></li>
</ul>
@@ -361,13 +362,9 @@ so that the community can contribute at large and help
implement it much quickly
<li>[Optional] Familiarize yourself with internals of Hudi using content on
this page, as well as <a
href="https://cwiki.apache.org/confluence/display/HUDI">wiki</a></li>
<li>Make your code change
<ul>
- <li>Every source file needs to include the Apache license header. Every
new dependency needs to have an
-open source license <a
href="https://www.apache.org/legal/resolved.html#criteria">compatible</a> with
Apache.</li>
- <li>If you are re-using code from another apache/open-source project,
licensing needs to be compatible and attribution added to <code
class="highlighter-rouge">LICENSE</code> file</li>
- <li>Please DO NOT copy paste any code from StackOverflow or other online
sources, since their license attribution would be unclear. Author them
yourself!</li>
<li>Get existing tests to pass using <code class="highlighter-rouge">mvn
clean install -DskipITs</code></li>
<li>Add adequate tests for your new functionality</li>
- <li>For involved changes, it’s best to also run the entire integration
test suite using <code class="highlighter-rouge">mvn clean install</code></li>
+ <li>For involved changes, it’s best to test the changes in real
production environments and report the results in the PR.</li>
<li>For website changes, please build the site locally & test
navigation, formatting & links thoroughly</li>
<li>If your code change changes some aspect of documentation (e.g new
config, default value change),
please ensure there is another PR to <a
href="https://github.com/apache/hudi/tree/asf-site/README.md">update the
docs</a> as well.</li>
@@ -387,6 +384,92 @@ where you replace <code
class="highlighter-rouge">HUDI-XXX</code> with the appro
<li>Finally, once your pull request is merged, make sure to <code
class="highlighter-rouge">Close</code> the JIRA.</li>
</ul>
+<h3 id="coding-guidelines">Coding guidelines</h3>
+
+<p>Our code can benefit from contributors speaking the same “language” when
authoring code. After all, it gets read a lot more than it gets
+written. So optimizing for “reads” is a good goal. The list below is a set of
guidelines, that contributors strive to upkeep and reflective
+of how we want to evolve our code in the future.</p>
+
+<h4 id="style">Style</h4>
+
+<ul>
+ <li><strong>Formatting</strong> We should rely on checkstyle and spotless to
auto fix formatting; automate this completely. Where we cannot,
+ we will err on the side of not taxing contributors with manual effort.</li>
+ <li><strong>Refactoring</strong>
+ <ul>
+ <li>Refactor with purpose; any refactor suggested should be attributable
to functionality that now becomes easy to implement.</li>
+ <li>A class is asking to be refactored, when it has several overloaded
responsibilities/have sets of fields/methods which are used more cohesively
than others.</li>
+ <li>Try to name tests using the given-when-then model, that cleans
separates preconditions (given), an action (when), and assertions (then).</li>
+ </ul>
+ </li>
+ <li><strong>Naming things</strong>
+ <ul>
+ <li>Let’s name uniformly; using the same word to denote the same
concept. e.g: bootstrap vs external vs source, when referring to bootstrapped
tables.
+Maybe they all mean the same, but having one word makes the code lot more
easily readable.</li>
+ <li>Let’s name consistently with Hudi terminology. e.g dataset vs table,
base file vs data file.</li>
+ <li>Class names preferably are nouns (e.g Runner) which reflect their
responsibility and methods are verbs (e.g run()).</li>
+ <li>Avoid filler words, that don’t add value e.g xxxInfo, xxxData,
etc.</li>
+ <li>We name classes in code starting with <code
class="highlighter-rouge">Hoodie</code> and not <code
class="highlighter-rouge">Hudi</code> and we want to keep it that way for
consistency/historical reasons.</li>
+ </ul>
+ </li>
+ <li><strong>Methods</strong>
+ <ul>
+ <li>Individual methods should short (~20-30 lines) and have a single
purpose; If you feel like it has a secondary purpose, then maybe it needs
+to be broken down more.</li>
+ <li>Lesser the number of arguments, the better;</li>
+ <li>Place caller methods on top of callee methods, whenever
possible.</li>
+ <li>Avoid “output” arguments e.g passing in a list and filling its
values within the method.</li>
+ <li>Try to limit individual if/else blocks to few lines to aid
readability.</li>
+ <li>Separate logical blocks of code with a newline in between e.g read a
file into memory, loop over the lines.</li>
+ </ul>
+ </li>
+ <li><strong>Classes</strong>
+ <ul>
+ <li>Like method, each Class should have a single
purpose/responsibility.</li>
+ <li>Try to keep class files to about 200 lines of length, nothing beyond
500.</li>
+ <li>Avoid stating the obvious in comments; e.g each line does not
deserve a comment; Document corner-cases/special perf considerations etc
clearly.</li>
+ <li>Try creating factory methods/builders and interfaces wherever you
feel a specific implementation may be changed down the line.</li>
+ </ul>
+ </li>
+</ul>
+
+<h4 id="substance">Substance</h4>
+
+<ul>
+ <li>Try to avoid large PRs; if unavoidable (many times they are) please
separate refactoring with the actual implementation of functionality.
+e.g renaming/breaking up a file and then changing code changes, makes the diff
very hard to review.</li>
+ <li><strong>Licensing</strong>
+ <ul>
+ <li>Every source file needs to include the Apache license header. Every
new dependency needs to have
+an open source license <a
href="https://www.apache.org/legal/resolved.html#criteria">compatible</a> with
Apache.</li>
+ <li>If you are re-using code from another apache/open-source project,
licensing needs to be compatible and attribution added to <code
class="highlighter-rouge">LICENSE</code> file</li>
+ <li>Please DO NOT copy paste any code from StackOverflow or other online
sources, since their license attribution would be unclear. Author them
yourself!</li>
+ </ul>
+ </li>
+ <li><strong>Code Organization</strong>
+ <ul>
+ <li>Anything in <code class="highlighter-rouge">hudi-common</code>
cannot depend on a specific engine runtime like Spark.</li>
+ <li>Any changes to bundles under <code
class="highlighter-rouge">packaging</code>, will be reviewed with additional
scrutiny to avoid breakages across versions.</li>
+ </ul>
+ </li>
+ <li><strong>Code reuse</strong>
+ <ul>
+ <li>Whenever you can, please use/enhance use existing utils classes in
code (<code class="highlighter-rouge">CollectionUtils</code>, <code
class="highlighter-rouge">ParquetUtils</code>, <code
class="highlighter-rouge">HoodieAvroUtils</code>). Search for classes ending in
<code class="highlighter-rouge">Utils</code>.</li>
+ <li>As a complex project, that must integrate with multiple systems, we
tend to avoid dependencies like <code class="highlighter-rouge">guava</code>,
<code class="highlighter-rouge">apache commons</code> for the sake of easy
integration.
+ Please start a discussion on the mailing list, before attempting to
reintroduce them</li>
+ <li>As a data system, that takes performance seriously, we also write
pieces of infrastructure (e.g <code
class="highlighter-rouge">ExternalSpillableMap</code>) natively, that are
optimized specifically for our scenarios.
+ Please start with them first, when solving problems.</li>
+ </ul>
+ </li>
+ <li><strong>Breaking changes</strong>
+ <ul>
+ <li>Any version changes for dependencies, needs to be ideally vetted
across different user environments in the community, to get enough confidence
before merging.</li>
+ <li>Any changes to methods annotated with <code
class="highlighter-rouge">PublicAPIMethod</code> or classes annotated with
<code class="highlighter-rouge">PublicAPIClass</code> require upfront
discussion and potentially an RFC.</li>
+ <li>Any non-backwards compatible changes similarly need upfront
discussion and the functionality needs to implement an upgrade-downgrade
path.</li>
+ </ul>
+ </li>
+</ul>
+
<h3 id="reviewing-coderfcs">Reviewing Code/RFCs</h3>
<ul>
@@ -401,6 +484,10 @@ where you replace <code
class="highlighter-rouge">HUDI-XXX</code> with the appro
<li>Staying humble and eager to learn, goes a long way in ensuring these
reviews are smooth.</li>
</ul>
</li>
+ <li>Reviewers are expected to uphold the code quality, standards outlined
above.</li>
+ <li>When merging PRs, always make sure you are squashing the commits using
the “Squash and Merge” feature in Github</li>
+ <li>When necessary/appropriate, reviewers could make changes themselves to
PR branches, with the intent to get the PR landed sooner. (see <a
href="https://cwiki.apache.org/confluence/display/HUDI/Resources#Resources-PushingChangesToPRs">how-to</a>)
+Reviewers should seek explicit approval from author, before making large
changes to the original PR.</li>
</ul>
<h3 id="suggest-changes">Suggest Changes</h3>