[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7520: ARROW-9189: [Website] Improve contributor guide

2020-06-23 Thread GitBox


jorisvandenbossche commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444295036



##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
   `ARROW-767: [C++] Filesystem abstraction 
`_).
 * Make sure that your code passes the unit tests. You can find instructions how
   to run the unit tests for each Arrow component in its respective README file.
+
+Core developers and others with a stake in the part of the project your change
+affects will review, request changes, and hopefully indicate their approval
+in the end. To make the review process smooth for everyone, try to
+
+* Break your work into small, single-purpose patches if possible. It’s much
+  harder to merge in a large change with a lot of disjoint features, and
+  particularly if you're new to the project, smaller changes are much easier
+  for maintainers to accept.
 * Add new unit tests for your code.
+* Follow the style guides for the part(s) of the project you're modifying.
+  Some languages (C++, Python, and Rust, for example) run a lint check in
+  continuous integration. For all languages, see their respective developer
+  documentation and READMEs for style guidance. In general, try to make it look
+  as if the codebase has a single author, and emulate any conventions you see,
+  whether or not they are officially documented or checked.
+
+When tests are passing and the pull request has been approved by the interested
+parties, a committer will merge the pull request. This is done with a
+command-line utility that does a squash merge, so all of your commits will be
+registered as a single commit to the master branch; this simplifies the
+connection between JIRA issues and commits, and it makes it easier to bisect
+history to identify where changes were introduced. A side effect of this way of
+merging is that your pull request will appear in the GitHub interface to have
+been "closed without merge". Do not be alarmed: if you look at the bottom, you
+will see a message that says "@user closed this in $COMMIT".
+
+Local git conventions
+++++++++++++++++++++++
+
+If you are tracking the Arrow source repository locally, here are some tips
+for using ``git``.
+
+All Arrow contributors work off of their personal fork of ``apache/arrow``
+and submit pull requests "upstream". Once you've cloned your fork of Arrow,
+be sure to::
+
+$ git remote add upstream https://github.com/apache/arrow
+
+to set the "upstream" repository.
+
+You are encouraged to develop on branches, rather than your own "master" branch,
+and it helps to keep your fork's master branch synced with ``upstream/master``.
 
-Thank you in advance for your contributions!
+To start a new branch, pull the latest from upstream first::
 
-Common Git conventions followed within the project
---------------------------------------------------
+   $ git fetch upstream
+   $ git checkout master
+   $ git reset --hard upstream/master
+   $ git checkout -b $NEW_BRANCH_NAME
 
-If you are tracking the Arrow source repository locally, following some common Git
-conventions would make everyone's workflow compatible.  These recommendations along with
-their rationale are outlined below.
+It does not matter what you call your branch. Some people like to use the JIRA
+number as branch name, others use descriptive names.
 
-It is strongly discouraged to use a regular ``git merge``, as a linear commit history is
-prefered by the project.  It is much easier to maintain, and makes for easier
-``cherry-picking`` of features; useful for backporting fixes to maintenance releases.
+Once you have a branch going, you should sync with ``upstream/master``
+regularly, as many commits merge to master every day.
+It is recommended to use ``git rebase`` rather than ``git merge``.

Review comment:
   Yes, it's partly a limitation of GitHub's review functionality, but *when* using GitHub (which is the case right now), my feeling is that a merging workflow works better with it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




jorisvandenbossche commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444293158



##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
+Once you have a branch going, you should sync with ``upstream/master``
+regularly, as many commits merge to master every day.
+It is recommended to use ``git rebase`` rather than ``git merge``.

Review comment:
   > Resolving conflicts in a PR with merge commits can be a nightmare
   
   If you only `merge upstream/master`, then in my experience that is never a nightmare (you should of course never mix rebasing and merging, as that will indeed give nightmares).
   
   Also, when merging master instead of rebasing, you only need to fix merge conflicts once, whereas we now have a complicated section here about how to simplify conflict resolution by squashing your commits while rebasing. This is never needed in a merging workflow (which IMO is also nicer because it preserves the ability to see what has changed since the last review on GitHub).
   
   Anyway, we are not going to resolve that discussion here ;) (and both workflows have their pros and cons). I was mainly wondering to what extent we want to push contributors toward a chosen workflow.
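
The merge-based sync described in this comment can be tried out in a throwaway sandbox. This is only a sketch under stated assumptions: the repository names (`upstream-repo`, `fork`), the branch name `feature`, and the file names are hypothetical stand-ins rather than anything from the Arrow repository, and it assumes a reasonably recent `git` is on the `PATH`.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every repository, branch, and file name here is made up.
work=$(mktemp -d)
cd "$work"

# Identity for the demo commits, so nothing depends on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for apache/arrow: one commit on master.
git init -q upstream-repo
cd upstream-repo
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Stand-in for a contributor's clone with an "upstream" remote.
cd "$work"
git clone -q upstream-repo fork
cd fork
git remote add upstream "$work/upstream-repo"
git fetch -q upstream
git checkout -q -b feature upstream/master
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"
reviewed=$(git rev-parse HEAD)   # the commit a reviewer has already seen

# Upstream moves on while the branch is under review.
cd "$work/upstream-repo"
echo more > more.txt
git add more.txt
git commit -q -m "upstream change"

# Sync by merging: existing commits keep their hashes, one merge commit is added.
cd "$work/fork"
git fetch -q upstream
git merge -q --no-edit upstream/master
git log --oneline --graph
```

After the merge, the commit stored in `reviewed` is still an ancestor of `HEAD` with its hash unchanged, which is the property that lets a reviewer see what changed since the last review.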






jorisvandenbossche commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444268553



##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
+Once you have a branch going, you should sync with ``upstream/master``
+regularly, as many commits merge to master every day.
+It is recommended to use ``git rebase`` rather than ``git merge``.
 To sync your local copy of a branch, you may do the following::
 
 $ git pull upstream branch --rebase

Review comment:
   Is this doing the same as `git rebase upstream/master`? (I am used to doing this.)
   
   EDIT: ah, I suppose my suggestion needs a `git fetch upstream` first, so the above is probably the one-liner to do this.
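
To compare the two forms discussed in this comment, here is a sandbox sketch that runs the two-step fetch-then-rebase on one branch and the one-line `git pull --rebase` on another. The repository names (`upstream-repo`, `fork`), branch names (`feature-a`, `feature-b`), and file names are hypothetical; the pull names `master` explicitly, which is an assumption about what the documented one-liner is meant to sync against.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every repository, branch, and file name here is made up.
work=$(mktemp -d)
cd "$work"

# Identity for the demo commits, so nothing depends on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for apache/arrow: one commit on master.
git init -q upstream-repo
cd upstream-repo
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Stand-in for a contributor's clone; two branches share the same feature commit.
cd "$work"
git clone -q upstream-repo fork
cd fork
git remote add upstream "$work/upstream-repo"
git fetch -q upstream
git checkout -q -b feature-a upstream/master
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"
git branch feature-b

# Upstream moves on.
cd "$work/upstream-repo"
echo more > more.txt
git add more.txt
git commit -q -m "upstream change"
cd "$work/fork"

# Form 1: fetch, then rebase onto the remote-tracking branch.
git fetch -q upstream
git rebase -q upstream/master

# Form 2: the one-liner, fetching and rebasing in a single command.
git checkout -q feature-b
git pull -q --rebase upstream master

# Both branches now sit directly on top of upstream/master.
git log --oneline feature-a
git log --oneline feature-b
```

In this sandbox both branches end up with the same linear history on top of `upstream/master`, with no merge commits.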

##
File path: docs/source/developers/contributing.rst
##
@@ -17,53 +17,73 @@
 
 .. _contributing:
 
-***********************
-Contribution Guidelines
-***********************
+****************************
+Contributing to Apache Arrow
+****************************
 
-There are many ways to contribute to Apache Arrow:
+Thanks for your interest in the Apache Arrow project. Arrow is a large project
+and may seem overwhelming when you're first getting involved.
+Contributing code is great, but that's probably not the first place to start.
+There are lots of ways to make valuable contributions to the project and
+community.
 
-* Contributing code (we call them "patches")
-* Writing documentation (another form of code, in