HyukjinKwon commented on a change in pull request #32926:
URL: https://github.com/apache/spark/pull/32926#discussion_r652404324
##########
File path: python/docs/source/development/ps_contributing.rst
##########
@@ -1,192 +0,0 @@
-==================
-Contributing Guide
-==================
-
-.. contents:: Table of contents:
-   :depth: 1
-   :local:
-
-Types of Contributions
-======================
-
-The largest amount of work consists simply of implementing the pandas API using Spark's built-in functions, which is usually straightforward. But there are many different forms of contributions in addition to writing code:
-
-1. Use the project and provide feedback, by creating new tickets or commenting on existing relevant tickets.
-
-2. Review existing pull requests.
-
-3. Improve the project's documentation.
-
-4. Write blog posts or tutorial articles evangelizing pandas API on Spark and help new users learn pandas API on Spark.
-
-5. Give a talk about pandas API on Spark at your local meetup or a conference.
-
-
-Step-by-step Guide For Code Contributions
-=========================================
-
-1. Read and understand the `Design Principles <design.rst>`_ for the project. Contributions should follow these principles.
-
-2. Signaling your work: If you are working on something, comment on the relevant ticket that you are doing so to avoid multiple people taking on the same work at the same time. It is also a good practice to signal that your work has stalled or you have moved on and want somebody else to take over.
-
-3. Understand what the functionality is in pandas or in Spark.
-
-4. Implement the functionality, with test cases providing close to 100% statement coverage. Document the functionality.
-
-5. Run existing and new test cases to make sure they still pass. Also run `dev/reformat` script to reformat Python files by using `Black <https://github.com/psf/black>`_, and run the linter `dev/lint-python`.
-
-6. Build the docs (`make html` in `docs` directory) and verify the docs related to your change look OK.
-
-7. Submit a pull request, and be responsive to code review feedback from other community members.
-
-That's it. Your contribution, once merged, will be available in the next release.
-
-
-Environment Setup
-=================
-
-Conda
------
-
-If you are using Conda, the pandas API on Spark installation and development environment are as follows.
-
-.. code-block:: bash
-
-    # Python 3.6+ is required
-    conda create --name koalas-dev-env python=3.6
-    conda activate koalas-dev-env
-    conda install -c conda-forge pyspark=2.4
-    pip install -r requirements-dev.txt
-    pip install -e . # installs koalas from current checkout
-
-Once setup, make sure you switch to `koalas-dev-env` before development:
-
-.. code-block:: bash
-
-    conda activate koalas-dev-env
-
-pip
----
-
-With Python 3.6+, pip can be used as below to install and set up the development environment.
-
-.. code-block:: bash
-
-    pip install pyspark==2.4
-    pip install -r requirements-dev.txt
-    pip install -e . # installs koalas from current checkout
-
-Running Tests
-=============
-
-There is a script `./dev/pytest` which is exactly same as `pytest` but with some default settings to run the tests easily.
-
-To run all the tests, similar to our CI pipeline:
-
-.. code-block:: bash
-
-    # Run all unittest and doctest
-    ./dev/pytest
-
-To run a specific test file:
-
-.. code-block:: bash
-
-    # Run unittest
-    ./dev/pytest -k test_dataframe.py
-
-    # Run doctest
-    ./dev/pytest -k series.py --doctest-modules databricks
-
-To run a specific doctest/unittest:
-
-.. code-block:: bash
-
-    # Run unittest
-    ./dev/pytest -k "DataFrameTest and test_Dataframe"
-
-    # Run doctest
-    ./dev/pytest -k DataFrame.corr --doctest-modules databricks
-
-Note that `-k` is used for simplicity although it takes an expression. You can use `--verbose` to check what to filter. See `pytest --help` for more details.
-
-
-Building Documentation
-======================
-
-To build documentation via Sphinx:
-
-.. code-block:: bash
-
-    cd docs && make clean html
-
-It generates HTMLs under `docs/build/html` directory. Open `docs/build/html/index.html` to check if documentation is built properly.
-
-
-Coding Conventions

Review comment:
   Removed. duplicate with https://spark.apache.org/docs/latest/api/python/development/contributing.html#code-and-docstring-guide and https://spark.apache.org/contributing.html


##########
File path: python/docs/source/development/ps_contributing.rst
##########
@@ -1,192 +0,0 @@
-==================
-Contributing Guide
-==================
-
-.. contents:: Table of contents:
-   :depth: 1
-   :local:
-
-Types of Contributions
-======================
-
-The largest amount of work consists simply of implementing the pandas API using Spark's built-in functions, which is usually straightforward. But there are many different forms of contributions in addition to writing code:
-
-1. Use the project and provide feedback, by creating new tickets or commenting on existing relevant tickets.
-
-2. Review existing pull requests.
-
-3. Improve the project's documentation.
-
-4. Write blog posts or tutorial articles evangelizing pandas API on Spark and help new users learn pandas API on Spark.
-
-5. Give a talk about pandas API on Spark at your local meetup or a conference.
-
-
-Step-by-step Guide For Code Contributions
-=========================================
-
-1. Read and understand the `Design Principles <design.rst>`_ for the project. Contributions should follow these principles.
-
-2. Signaling your work: If you are working on something, comment on the relevant ticket that you are doing so to avoid multiple people taking on the same work at the same time. It is also a good practice to signal that your work has stalled or you have moved on and want somebody else to take over.
-
-3. Understand what the functionality is in pandas or in Spark.
-
-4. Implement the functionality, with test cases providing close to 100% statement coverage. Document the functionality.
-
-5. Run existing and new test cases to make sure they still pass. Also run `dev/reformat` script to reformat Python files by using `Black <https://github.com/psf/black>`_, and run the linter `dev/lint-python`.
-
-6. Build the docs (`make html` in `docs` directory) and verify the docs related to your change look OK.
-
-7. Submit a pull request, and be responsive to code review feedback from other community members.
-
-That's it. Your contribution, once merged, will be available in the next release.
-
-
-Environment Setup
-=================
-
-Conda
------
-
-If you are using Conda, the pandas API on Spark installation and development environment are as follows.
-
-.. code-block:: bash
-
-    # Python 3.6+ is required
-    conda create --name koalas-dev-env python=3.6
-    conda activate koalas-dev-env
-    conda install -c conda-forge pyspark=2.4
-    pip install -r requirements-dev.txt
-    pip install -e . # installs koalas from current checkout
-
-Once setup, make sure you switch to `koalas-dev-env` before development:
-
-.. code-block:: bash
-
-    conda activate koalas-dev-env
-
-pip
----
-
-With Python 3.6+, pip can be used as below to install and set up the development environment.
-
-.. code-block:: bash
-
-    pip install pyspark==2.4
-    pip install -r requirements-dev.txt
-    pip install -e . # installs koalas from current checkout
-
-Running Tests
-=============
-
-There is a script `./dev/pytest` which is exactly same as `pytest` but with some default settings to run the tests easily.
-
-To run all the tests, similar to our CI pipeline:
-
-.. code-block:: bash
-
-    # Run all unittest and doctest
-    ./dev/pytest
-
-To run a specific test file:
-
-.. code-block:: bash
-
-    # Run unittest
-    ./dev/pytest -k test_dataframe.py
-
-    # Run doctest
-    ./dev/pytest -k series.py --doctest-modules databricks
-
-To run a specific doctest/unittest:
-
-.. code-block:: bash
-
-    # Run unittest
-    ./dev/pytest -k "DataFrameTest and test_Dataframe"
-
-    # Run doctest
-    ./dev/pytest -k DataFrame.corr --doctest-modules databricks
-
-Note that `-k` is used for simplicity although it takes an expression. You can use `--verbose` to check what to filter. See `pytest --help` for more details.
-
-
-Building Documentation
-======================
-
-To build documentation via Sphinx:
-
-.. code-block:: bash
-
-    cd docs && make clean html
-
-It generates HTMLs under `docs/build/html` directory. Open `docs/build/html/index.html` to check if documentation is built properly.
-
-
-Coding Conventions
-==================
-
-We follow `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_ with one exception: lines can be up to 100 characters in length, not 79.
-
-Doctest Conventions

Review comment:
   Moved and merged.


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
