This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 8394571423 ARROW-16582: [Python][Docs] Update Python build docs to include dataset
8394571423 is described below

commit 8394571423413265f4abc2a2b1e71814e20dfeb0
Author: Raúl Cumplido <[email protected]>
AuthorDate: Thu May 19 10:03:36 2022 +0200

    ARROW-16582: [Python][Docs] Update Python build docs to include dataset
    
    This PR aims to update the python developers guide to default building
    pyarrow with `DATASET` on. It also fixes a minor command issue as the whole
    guide refers to the build directory as `arrow/python`, `arrow/cpp`,
    `arrow/ci` instead of `/arrow`.
    
    Closes #13187 from raulcd/ARROW-16582
    
    Authored-by: Raúl Cumplido <[email protected]>
    Signed-off-by: Joris Van den Bossche <[email protected]>
---
 docs/source/developers/python.rst | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index c7a899e911..93971931c1 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -89,6 +89,8 @@ particular group, prepend ``only-`` instead, for example ``--only-parquet``.
 
 The test groups currently include:
 
+* ``dataset``: Apache Arrow Dataset tests
+* ``flight``: Flight RPC tests
 * ``gandiva``: tests for Gandiva expression compiler (uses LLVM)
 * ``hdfs``: tests that use libhdfs or libhdfs3 to access the Hadoop filesystem
 * ``hypothesis``: tests that use the ``hypothesis`` module for generating
@@ -100,7 +102,6 @@ The test groups currently include:
 * ``plasma``: Plasma Object Store tests
 * ``s3``: Tests for Amazon S3
 * ``tensorflow``: Tests that involve TensorFlow
-* ``flight``: Flight RPC tests
 
 Benchmarking
 ------------
@@ -264,6 +265,7 @@ created above (stored in ``$ARROW_HOME``):
    $ cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
            -DCMAKE_INSTALL_LIBDIR=lib \
            -DCMAKE_BUILD_TYPE=Debug \
+           -DARROW_DATASET=ON \
            -DARROW_WITH_BZ2=ON \
            -DARROW_WITH_ZLIB=ON \
            -DARROW_WITH_ZSTD=ON \
@@ -283,6 +285,7 @@ There are a number of optional components that can can be switched ON by
 adding flags with ``ON``:
 
 * ``ARROW_CUDA``: Support for CUDA-enabled GPUs
+* ``ARROW_DATASET``: Support for Apache Arrow Dataset
 * ``ARROW_FLIGHT``: Flight RPC framework
 * ``ARROW_GANDIVA``: LLVM-based expression compiler
 * ``ARROW_ORC``: Support for Apache ORC file format
@@ -335,7 +338,7 @@ Python executable which you are using.
 For any other C++ build challenges, see :ref:`cpp-development`.
 
 In case you may need to rebuild the C++ part due to errors in the process it is
-advisable to delete the build folder with command ``rm -rf /arrow/cpp/build``.
+advisable to delete the build folder with command ``rm -rf arrow/cpp/build``.
 If the build has passed successfully and you need to rebuild due to latest pull
 from git master, then this step is not needed.
 
@@ -345,6 +348,7 @@ Now, build pyarrow:
 
    $ pushd arrow/python
    $ export PYARROW_WITH_PARQUET=1
+   $ export PYARROW_WITH_DATASET=1
    $ python setup.py build_ext --inplace
    $ popd
 
