This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new d94a9fc  ARROW-4339: [C++][Python] Developer documentation overhaul for 0.13 release
d94a9fc is described below

commit d94a9fcee801d9e185f36f767bb5b70566df70ff
Author: Wes McKinney <[email protected]>
AuthorDate: Sun Mar 17 16:26:34 2019 -0500

    ARROW-4339: [C++][Python] Developer documentation overhaul for 0.13 release
    
    This was pretty much a huge pain, but it addresses the documentation debt
    accumulated after the conda compiler migration and the CMake refactor. I
    suggest we not stress too much over small details here and instead improve
    these docs in follow-up PRs. I did the best I could under the circumstances
    and need to move on to other things now.
    
    I think the overall organization of the Sphinx project for developers is
    much improved; take a look (I will post a link to a published version for
    review).
    
    JIRAs addressed by this PR and other things I did:
    
    * Update cpp/thirdparty/README.md given the CMake refactor (it was totally out of date); it now directs users to the Sphinx C++ developer guide
    
    * ARROW-4339: Move cpp/README.md to Sphinx documentation (and clean it up a lot!!)
    * ARROW-4425: Move Contributing Guidelines from Confluence to Sphinx; update top-level README
    * ARROW-4232: Remove references to pre-gcc5 ABI issues
    * ARROW-4165: Move Windows C++ developer guide to Sphinx (from cpp/apidoc/Windows.md)
    * ARROW-4547: Update Python development instructions re: producing CUDA-enabled pyarrow
    * ARROW-4326 / ARROW-3096: Update Python build instructions re: January 2019 compiler migration
    
    Author: Wes McKinney <[email protected]>
    
    Closes #3942 from wesm/developer-docs-0.13 and squashes the following commits:
    
    a3c3dd5de <Wes McKinney> Add some Boost info, misc cleaning
    2ccc3de18 <Wes McKinney> Remove index.md altogether
    66da97e7f <Wes McKinney> Remove unused text from cpp/apidoc/index.md
    504bc134e <Wes McKinney> restore 'what's in the arrow libraries' section
    8d1f33e19 <Wes McKinney> Finish initial documentation revamp for 0.13, stopping here
    84dd680a2 <Wes McKinney> Some docs reorg, begin rewriting cpp/README.md into docs/source/developers/cpp.rst
---
 README.md                                          |  38 +-
 ci/conda_env_cpp.yml                               |   2 +-
 cpp/README.md                                      | 550 +------------
 cpp/apidoc/Windows.md                              | 291 -------
 cpp/apidoc/index.md                                |  42 -
 cpp/thirdparty/README.md                           |  90 +-
 docs/README.md                                     |   2 +-
 docs/source/developers/contributing.rst            |  88 ++
 docs/source/developers/cpp.rst                     | 913 +++++++++++++++++++++
 docs/source/developers/documentation.rst           |   2 +-
 docs/source/developers/index.rst                   |   6 +-
 docs/source/developers/integration.rst             |   2 +
 .../development.rst => developers/python.rst}      | 227 +++--
 docs/source/index.rst                              |  30 +-
 docs/source/python/benchmarks.rst                  |   2 +
 docs/source/python/index.rst                       |   1 -
 docs/source/python/install.rst                     |   2 +-
 docs/source/python/parquet.rst                     |   6 +-
 python/README.md                                   |  49 +-
 19 files changed, 1194 insertions(+), 1149 deletions(-)

diff --git a/README.md b/README.md
index 621e119..24157b3 100644
--- a/README.md
+++ b/README.md
@@ -59,7 +59,7 @@ The reference Arrow libraries contain a number of distinct software components:
   library)
 - Reference-counted off-heap buffer memory management, for zero-copy memory
   sharing and handling memory-mapped files
-- Low-overhead IO interfaces to files on disk, HDFS (C++ only)
+- IO interfaces to local and remote filesystems
 - Self-describing binary wire formats (streaming and batch/file-like) for
   remote procedure calls (RPC) and
   interprocess communication (IPC)
@@ -67,6 +67,10 @@ The reference Arrow libraries contain a number of distinct software components:
   implementations (e.g. sending data from Java to C++)
 - Conversions to and from other in-memory data structures
 
+## How to Contribute
+
+Please read our latest [project contribution guide][5].
+
 ## Getting involved
 
 Even if you do not plan to contribute to Apache Arrow itself or Arrow
@@ -79,38 +83,8 @@ integrations in other projects, we'd be happy to have you involved:
 - [Learn the format][2]
 - Contribute code to one of the reference implementations
 
-## How to Contribute
-
-We prefer to receive contributions in the form of GitHub pull requests. Please
-send pull requests against the [github.com/apache/arrow][4] repository.
-
-If you are looking for some ideas on what to contribute, check out the [JIRA
-issues][3] for the Apache Arrow project. Comment on the issue and/or contact
-[[email protected]](http://mail-archives.apache.org/mod_mbox/arrow-dev/)
-with your questions and ideas.
-
-If you’d like to report a bug but don’t have time to fix it, you can still post
-it on JIRA, or email the mailing list
-[[email protected]](http://mail-archives.apache.org/mod_mbox/arrow-dev/)
-
-To contribute a patch:
-
-1. Break your work into small, single-purpose patches if possible. It’s much
-harder to merge in a large change with a lot of disjoint features.
-2. Create a JIRA for your patch on the [Arrow Project
-JIRA](https://issues.apache.org/jira/browse/ARROW).
-3. Submit the patch as a GitHub pull request against the master branch. For a
-tutorial, see the GitHub guides on forking a repo and sending a pull
-request. Prefix your pull request name with the JIRA name (ex:
-https://github.com/apache/arrow/pull/240).
-4. Make sure that your code passes the unit tests. You can find instructions
-how to run the unit tests for each Arrow component in its respective README
-file.
-5. Add new unit tests for your code.
-
-Thank you in advance for your contributions!
-
 [1]: mailto:[email protected]
 [2]: https://github.com/apache/arrow/tree/master/format
 [3]: https://issues.apache.org/jira/browse/ARROW
 [4]: https://github.com/apache/arrow
+[5]: https://github.com/apache/arrow/blob/master/docs/source/developers/contributing.rst
\ No newline at end of file
diff --git a/ci/conda_env_cpp.yml b/ci/conda_env_cpp.yml
index 88e7d95..e27b5bf 100644
--- a/ci/conda_env_cpp.yml
+++ b/ci/conda_env_cpp.yml
@@ -38,6 +38,6 @@ python
 rapidjson
 re2
 snappy
-thrift-cpp
+thrift-cpp=0.12.0
 zlib
 zstd
diff --git a/cpp/README.md b/cpp/README.md
index cbb5221..8d29da1 100644
--- a/cpp/README.md
+++ b/cpp/README.md
@@ -17,554 +17,18 @@
   under the License.
 -->
 
-# Apache Arrow C++ codebase
+# Apache Arrow C++
 
 This directory contains the code and build system for the Arrow C++ libraries,
 as well as for the C++ libraries for Apache Parquet.
 
-## System setup
+## Installation
 
-Arrow uses CMake as a build configuration system. Currently, it supports
-in-source and out-of-source builds with the latter one being preferred.
+See http://arrow.apache.org/install/ for the latest instructions on how to
+install pre-compiled binary versions of the library.
 
-Building Arrow requires:
+## Source Builds and Development
 
-* A C++11-enabled compiler. On Linux, gcc 4.8 and higher should be sufficient.
-* CMake 3.2 or higher
-* Boost
-* Bison/flex (for building Apache Thrift from source only,
-a parquet dependency.)
+Please refer to our latest [C++ Development Documentation][1].
 
-Testing Arrow with ctest requires:
-
-* python
-
-On Ubuntu/Debian you can install the requirements with:
-
-```shell
-sudo apt-get install \
-     autoconf \
-     build-essential \
-     cmake \
-     libboost-dev \
-     libboost-filesystem-dev \
-     libboost-regex-dev \
-     libboost-system-dev \
-     python \
-     bison \
-     flex
-```
-
-On Alpine Linux:
-
-```shell
-apk add autoconf \
-        bash \
-        boost-dev \
-        cmake \
-        g++ \
-        gcc \
-        make
-```
-
-On macOS, you can use [Homebrew][1]:
-
-```shell
-git clone https://github.com/apache/arrow.git
-cd arrow
-brew update && brew bundle --file=c_glib/Brewfile
-```
-
-If you are developing on Windows, see the [Windows developer guide][2].
-
-## Building Arrow
-
-Simple release build:
-
-    git clone https://github.com/apache/arrow.git
-    cd arrow/cpp
-    mkdir release
-    cd release
-    cmake -DARROW_BUILD_TESTS=ON  ..
-    make unittest
-
-Simple debug build:
-
-    git clone https://github.com/apache/arrow.git
-    cd arrow/cpp
-    mkdir debug
-    cd debug
-    cmake -DCMAKE_BUILD_TYPE=Debug -DARROW_BUILD_TESTS=ON ..
-    make unittest
-
-If you do not need to build the test suite, you can omit the
-`ARROW_BUILD_TESTS` option (the default is not to build the unit tests).
-
-Detailed unit test logs will be placed in the build directory under
-`build/test-logs`.
-
-On some Linux distributions, running the test suite might require setting an
-explicit locale. If you see any locale-related errors, try setting the
-environment variable (which requires the `locales` package or equivalent):
-
-```
-export LC_ALL="en_US.UTF-8"
-```
-
-## Modular Build Targets
-
-Since there are several major parts of the C++ project, we have provided
-modular CMake targets for building each library component, group of unit tests
-and benchmarks, and their dependencies:
-
-* `make arrow` for Arrow core libraries
-* `make parquet` for Parquet libraries
-* `make gandiva` for Gandiva (LLVM expression compiler) libraries
-* `make plasma` for Plasma libraries, server
-
-To build the unit tests or benchmarks, add `-tests` or `-benchmarks` to the
-target name. So `make arrow-tests` will build the Arrow core unit tests. Using
-the `-all` target, e.g. `parquet-all`, will build everything.
-
-If you wish to only build and install one or more project subcomponents, we
-have provided the CMake option `ARROW_OPTIONAL_INSTALL` to only install targets
-that have been built. For example, if you only wish to build the Parquet
-libraries, its tests, and its dependencies, you can run:
-
-```
-cmake .. -DARROW_PARQUET=ON -DARROW_OPTIONAL_INSTALL=ON -DARROW_BUILD_TESTS=ON
-make parquet
-make install
-```
-
-If you omit an explicit target when invoking `make`, all targets will be built.
-
-## Parquet Development Notes
-
-To build the C++ libraries for Apache Parquet, add the flag
-`-DARROW_PARQUET=ON` when invoking CMake. The Parquet libraries and unit tests
-can be built with the `parquet` make target:
-
-```shell
-make parquet
-```
-
-Running `ctest -L unittest` will run all built C++ unit tests, while `ctest -L
-parquet` will run only the Parquet unit tests. The unit tests depend on the
-environment variable `PARQUET_TEST_DATA`, which points to a git submodule of the
-repository https://github.com/apache/parquet-testing:
-
-```shell
-git submodule update --init
-export PARQUET_TEST_DATA=$ARROW_ROOT/cpp/submodules/parquet-testing/data
-```
-
-Here `$ARROW_ROOT` is the absolute path to the Arrow codebase.
-
-### Statically linking to Arrow on Windows
-
-The Arrow headers on Windows static library builds (enabled by the CMake
-option `ARROW_BUILD_STATIC`) use the preprocessor macro `ARROW_STATIC` to
-suppress dllimport/dllexport marking of symbols. Projects that statically link
-against Arrow on Windows additionally need this definition. The Unix builds do
-not use the macro.
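As an illustrative sketch (the source file name, `%ARROW_HOME%` variable, and library path are hypothetical, not from the original docs), a Windows project statically linking Arrow would add the definition when compiling:

```shell
:: Hypothetical MSVC command line; adjust paths to your Arrow installation
cl /EHsc /DARROW_STATIC /I%ARROW_HOME%\include my_app.cc /link %ARROW_HOME%\lib\arrow_static.lib
```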
-
-### Building/Running benchmarks
-
-Follow the directions for the simple build, except run cmake
-with the `-DARROW_BUILD_BENCHMARKS=ON` parameter:
-
-    cmake -DARROW_BUILD_TESTS=ON -DARROW_BUILD_BENCHMARKS=ON ..
-
-and instead of `make unittest` run either `make; ctest` to run both unit tests
-and benchmarks, or `make benchmark` to run only the benchmark tests.
-
-Benchmark logs will be placed in the build directory under `build/benchmark-logs`.
-
-### Testing with LLVM AddressSanitizer
-
-To use AddressSanitizer (ASAN) to find bad memory accesses or leaks with LLVM,
-pass `-DARROW_USE_ASAN=ON` when building. You must use clang to compile with
-ASAN, and `ARROW_USE_ASAN` is mutually-exclusive with the valgrind option
-`ARROW_TEST_MEMCHECK`.
-
-### Building/Running fuzzers
-
-Fuzzers can help find unhandled exceptions and problems with untrusted input
-that may lead to crashes, security issues and undefined behavior. They do this
-by generating random input data and observing the behavior of the executed
-code. To build the fuzzer code, LLVM is required (GCC-based compilers won't
-work). You can build the fuzzers using the following command:
-
-    cmake -DARROW_FUZZING=ON -DARROW_USE_ASAN=ON ..
-
-`ARROW_FUZZING` will enable building of fuzzer executables as well as enable the
-addition of coverage helpers via `ARROW_USE_COVERAGE`, so that the fuzzer can observe
-the program execution.
-
-It is also wise to enable some sanitizers like `ARROW_USE_ASAN` (see above), which
-activates the address sanitizer. This way, we ensure that bad memory operations
-provoked by the fuzzer will be found early. You may also enable other sanitizers as
-well. Just keep in mind that some of them do not work together and some may result
-in very long execution times, which will slow down the fuzzing procedure.
-
-Now you can start one of the fuzzers, e.g.:
-
-    ./debug/debug/ipc-fuzzing-test
-
-This will try to find a malformed input that crashes the payload and will show the
-stack trace as well as the input data. After a problem is found this way, it should
-be reported and fixed. Usually, the fuzzing process cannot be continued until the
-fix is applied, since the fuzzer usually converges to the problem again.
-
-If you build fuzzers with ASAN, you need to set the `ASAN_SYMBOLIZER_PATH`
-environment variable to the absolute path of `llvm-symbolizer`, which is a tool
-that ships with LLVM.
-
-```shell
-export ASAN_SYMBOLIZER_PATH=$(type -p llvm-symbolizer)
-```
-
-Note that some fuzzer builds currently reject paths with a version qualifier
-(like `llvm-symbolizer-5.0`). To overcome this, set an appropriate symlink
-(here, when using LLVM 5.0):
-
-```shell
-ln -sf /usr/bin/llvm-symbolizer-5.0 /usr/bin/llvm-symbolizer
-```
-
-There are some problems that may occur during the compilation process:
-
-- libfuzzer was not distributed with your LLVM: `ld: file not found: .../libLLVMFuzzer.a`
-- your LLVM is too old: `clang: error: unsupported argument 'fuzzer' to option 'fsanitize='`
-
-### Third-party dependencies and configuration
-
-Arrow depends on a number of third-party libraries. We support these in a few
-ways:
-
-* Building dependencies from source by downloading archives from the internet
-* Building dependencies from source using local archives (to allow offline
-  builds)
-* Building with locally-installed libraries
-
-See [thirdparty/README.md][5] for details about these options and how to
-configure your build toolchain.
-
-### Building Python integration library (optional)
-
-The optional `arrow_python` shared library can be built by passing
-`-DARROW_PYTHON=on` to CMake. This must be installed or in your library load
-path to be able to build pyarrow, the Arrow Python bindings.
-
-The Python library must be built against the same Python version for which you
-are building pyarrow, e.g. Python 2.7 or Python 3.6. NumPy must also be
-installed.
-
-### Building CUDA extension library (optional)
-
-The optional `arrow_cuda` shared library can be built by passing
-`-DARROW_CUDA=on`. This requires a CUDA installation to build, and to use many
-of the functions you must have a functioning CUDA-compatible GPU.
-
-The CUDA toolchain used to build the library can be customized by using the
-`$CUDA_HOME` environment variable.
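For example, a minimal sketch of building against a specific CUDA toolkit (the toolkit path is illustrative):

```shell
# Illustrative: point the build at a particular CUDA toolkit, then enable the extension
export CUDA_HOME=/usr/local/cuda-10.0
cmake -DARROW_CUDA=on ..
make
```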
-
-This library is still in Alpha stages, and subject to API changes without
-deprecation warnings.
-
-### Building Apache ORC integration (optional)
-
-The optional arrow reader for the Apache ORC format (found in the
-`arrow::adapters::orc` namespace) can be built by passing `-DARROW_ORC=on`.
-This is currently not supported on Windows. Note that this functionality is
-still in Alpha stages, and subject to API changes without deprecation warnings.
-
-### Building and developing Gandiva (optional)
-
-The Gandiva library supports compiling and evaluating expressions on arrow
-data. It uses LLVM for doing just-in-time compilation of the expressions.
-
-In addition to the Arrow dependencies, Gandiva requires:
-* On Linux, gcc 4.9 or a higher C++11-enabled compiler.
-* LLVM
-
-On Ubuntu/Debian you can install these requirements with:
-
-```shell
-sudo apt-add-repository -y "deb http://llvm.org/apt/trusty/ llvm-toolchain-trusty-7.0 main"
-sudo apt-get update -qq
-sudo apt-get install llvm-7.0-dev
-```
-
-On macOS, you can use [Homebrew][1]:
-
-```shell
-brew install llvm@7
-```
-
-The optional `gandiva` libraries and tests can be built by passing
-`-DARROW_GANDIVA=on`.
-
-```shell
-cmake .. -DARROW_GANDIVA=ON -DARROW_BUILD_TESTS=ON
-make
-ctest -L gandiva
-```
-
-This library is still in Alpha stages, and subject to API changes without
-deprecation warnings.
-
-### Building and developing Flight (optional)
-
-In addition to the Arrow dependencies, Flight requires:
-* gRPC (>= 1.14, roughly)
-* Protobuf (>= 3.6, earlier versions may work)
-* c-ares (used by gRPC)
-
-By default, Arrow will try to download and build these dependencies
-when building Flight.
-
-The optional `flight` libraries and tests can be built by passing
-`-DARROW_FLIGHT=ON`.
-
-```shell
-cmake .. -DARROW_FLIGHT=ON -DARROW_BUILD_TESTS=ON
-make
-```
-
-You can also use existing installations of the extra dependencies.
-When building, set the environment variables `GRPC_HOME` and/or
-`PROTOBUF_HOME` and/or `CARES_HOME`.
-
-You may try using system libraries for gRPC and Protobuf, but these
-are likely to be too old.
-
-On Ubuntu/Debian, you can try:
-
-```shell
-sudo apt-get install libgrpc-dev libgrpc++-dev protobuf-compiler-grpc libc-ares-dev
-```
-
-Note that the version of gRPC in Ubuntu 18.10 is too old; you will
-have to install gRPC from source. (Ubuntu 19.04/Debian Sid may work.)
-
-On macOS, you can try [Homebrew][1]:
-
-```shell
-brew install grpc
-```
-
-You can also install gRPC from source. In this case, you must install
-gRPC to generate the necessary files for CMake to find gRPC:
-
-```shell
-cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF -DgRPC_PROTOBUF_PROVIDER=package -DgRPC_ZLIB_PROVIDER=package -DgRPC_CARES_PROVIDER=package -DgRPC_SSL_PROVIDER=package
-```
-
-You can then specify `-DgRPC_DIR` to `cmake`.
-
-### API documentation
-
-To generate the (html) API documentation, run the following command in the
-apidoc directory:
-
-    doxygen Doxyfile
-
-This requires [Doxygen](http://www.doxygen.org) to be installed.
-
-## Development
-
-This project follows [Google's C++ Style Guide][3] with minor exceptions:
-
-  *  We relax the line length restriction to 90 characters.
-  *  We use the NULLPTR macro defined in `src/arrow/util/macros.h` to
-     support building C++/CLI (ARROW-1134)
-  *  We use doxygen style comments ("///") instead of line comments ("//")
-     in header files.
-
-### Memory Pools
-
-We provide a default memory pool with `arrow::default_memory_pool()`. As a
-matter of convenience, some of the array builder classes have constructors
-which use the default pool without explicitly passing it. You can disable these
-constructors in your application (so that you are accounting properly for all
-memory allocations) by defining `ARROW_NO_DEFAULT_MEMORY_POOL`.
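A minimal sketch of compiling an application with these convenience constructors disabled (the file name, `$ARROW_HOME`, and include path are placeholders):

```shell
# Illustrative: with this define, allocations must go through an explicitly passed MemoryPool
g++ -std=c++11 -DARROW_NO_DEFAULT_MEMORY_POOL -I$ARROW_HOME/include -c my_app.cc
```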
-
-### Header files
-
-We use the `.h` extension for C++ header files. Any header file name not
-containing `internal` is considered to be a public header, and will be
-automatically installed by the build.
-
-### Error Handling and Exceptions
-
-For error handling, we use `arrow::Status` values instead of throwing C++
-exceptions. Since the Arrow C++ libraries are intended to be useful as a
-component in larger C++ projects, using `Status` objects can help with good
-code hygiene by making explicit when a function is expected to be able to fail.
-
-For expressing invariants and "cannot fail" errors, we use DCHECK macros
-defined in `arrow/util/logging.h`. These checks are disabled in release builds
-and are intended to catch internal development errors, particularly when
-refactoring. These macros are not to be included in any public header files.
-
-Since we do not use exceptions, we avoid doing expensive work in object
-constructors. Objects that are expensive to construct may often have private
-constructors, with public static factory methods that return `Status`.
-
-There are a number of object constructors, like `arrow::Schema` and
-`arrow::RecordBatch` where larger STL container objects like `std::vector` may
-be created. While it is possible for `std::bad_alloc` to be thrown in these
-constructors, the circumstances where they would are somewhat esoteric, and it
-is likely that an application would have encountered other more serious
-problems prior to having `std::bad_alloc` thrown in a constructor.
-
-### Extra debugging help
-
-If you use the CMake option `-DARROW_EXTRA_ERROR_CONTEXT=ON` it will compile
-the libraries with extra debugging information on error checks inside the
-`RETURN_NOT_OK` macro. In unit tests with `ASSERT_OK`, this will yield error
-outputs like:
-
-
-```
-../src/arrow/ipc/ipc-read-write-test.cc:609: Failure
-Failed
-NotImplemented: ../src/arrow/ipc/ipc-read-write-test.cc:574 code: writer->WriteRecordBatch(batch)
-../src/arrow/ipc/writer.cc:778 code: CheckStarted()
-../src/arrow/ipc/writer.cc:755 code: schema_writer.Write(&dictionaries_)
-../src/arrow/ipc/writer.cc:730 code: WriteSchema()
-../src/arrow/ipc/writer.cc:697 code: WriteSchemaMessage(schema_, dictionary_memo_, &schema_fb)
-../src/arrow/ipc/metadata-internal.cc:651 code: SchemaToFlatbuffer(fbb, schema, dictionary_memo, &fb_schema)
-../src/arrow/ipc/metadata-internal.cc:598 code: FieldToFlatbuffer(fbb, *schema.field(i), dictionary_memo, &offset)
-../src/arrow/ipc/metadata-internal.cc:508 code: TypeToFlatbuffer(fbb, *field.type(), &children, &layout, &type_enum, dictionary_memo, &type_offset)
-Unable to convert type: decimal(19, 4)
-```
-
-### Deprecations and API Changes
-
-We use the compiler definition `ARROW_NO_DEPRECATED_API` to disable APIs that
-have been deprecated. It is a good practice to compile third party applications
-with this flag to proactively catch and account for API changes.
-
-### Keeping includes clean with include-what-you-use
-
-We have provided a `build-support/iwyu/iwyu.sh` convenience script for invoking
-Google's [include-what-you-use][4] tool, also known as IWYU. This includes
-various suppressions for more informative output. After building IWYU
-(following instructions in the README), you can run it on all files by running:
-
-```shell
-CC="clang-4.0" CXX="clang++-4.0" cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
-../build-support/iwyu/iwyu.sh all
-```
-
-This presumes that `include-what-you-use` and `iwyu_tool.py` are in your
-`$PATH`. If you compiled IWYU using a different version of clang, then
-substitute the version number above accordingly.
-
-We have provided a Docker-based IWYU setup to make it easier to run these
-checks. This can be run using the docker-compose setup in the `dev/` directory:
-
-```shell
-# If you have not built the base image already
-docker build -t arrow_integration_xenial_base -f dev/docker_common/Dockerfile.xenial.base .
-
-dev/run_docker_compose.sh iwyu
-```
-
-### Linting
-
-We require that you follow a certain coding style in the C++ code base.
-You can check your code abides by that coding style by running:
-
-    make lint
-
-You can also fix any formatting errors automatically:
-
-    make format
-
-These commands require `clang-format-6.0` (and not any other version).
-You may find the required packages at http://releases.llvm.org/download.html
-or use the Debian/Ubuntu APT repositories on https://apt.llvm.org/. On macOS
-with [Homebrew][1] you can get it via `brew install llvm@6`.
-
-Depending on how you installed clang-format, the build system may not be able
-to find it. You can provide an explicit path to your LLVM installation (or the
-root path for the clang tools) with the environment variable
-`$CLANG_TOOLS_PATH` or by passing `-DClangTools_PATH=$PATH_TO_CLANG_TOOLS` when
-invoking CMake.
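For example (the LLVM installation path is illustrative and varies by system):

```shell
# Illustrative: tell the build system where the clang tools live before configuring
export CLANG_TOOLS_PATH=/usr/lib/llvm-6.0/bin
cmake ..
make format
```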
-
-Additionally, all CMake files should go through an automatic formatter.
-You'll need Python 3 and [cmake_format](https://github.com/cheshirekow/cmake_format)
-installed.  Then in the top-level directory run the `run-cmake-format.py`
-script.
-
-
-## Checking for ABI and API stability
-
-To build ABI compliance reports, you need to install the two tools
-`abi-dumper` and `abi-compliance-checker`.
-
-Build Arrow C++ in Debug mode; alternatively, you can use `-Og`, which also
-builds with the necessary symbols but includes a bit of code optimization.
-Once the build has finished, you can generate ABI reports using:
-
-```
-abi-dumper -lver 9 debug/libarrow.so -o ABI-9.dump
-```
-
-The above version number is freely selectable. As we want to compare versions,
-you should now `git checkout` the version you want to compare it to and re-run
-the above command using a different version number. Once both reports are
-generated, you can build a comparison report using:
-
-```
-abi-compliance-checker -l libarrow -d1 ABI-PY-9.dump -d2 ABI-PY-10.dump
-```
-
-The report is then generated in `compat_reports/libarrow` as HTML.
-
-## Continuous Integration
-
-Pull requests are run through travis-ci for continuous integration.  You can avoid
-build failures by running the following checks before submitting your pull request:
-
-    make unittest
-    make lint
-    # The next command may change your code.  It is recommended you commit
-    # before running it.
-    make format # requires clang-format is installed
-
-We run our CI builds with more compiler warnings enabled for the Clang
-compiler. Please run CMake with
-
-`-DBUILD_WARNING_LEVEL=CHECKIN`
-
-to avoid failures due to compiler warnings.
-
-Note that the clang-tidy target may take a while to run.  You might consider
-running clang-tidy separately on the files you have added/changed before
-invoking the make target to reduce iteration time.  Also, it might generate warnings
-that aren't valid.  To avoid these you can add a line comment `// NOLINT`. If
-NOLINT doesn't suppress the warnings, you can add the file in question to
-the .clang-tidy-ignore file.  This will allow `make check-clang-tidy` to pass in
-travis-CI (but still surface the potential warnings in `make clang-tidy`). Ideally,
-both of these options would be used rarely. Currently known use-cases where they
-are required:
-
-*  Parameterized tests in google test.
-
-## CMake version requirements
-
-We support CMake 3.2 and higher. Some features require a newer version of CMake:
-
-* Building the benchmarks requires 3.6 or higher
-* Building zstd from source requires 3.7 or higher
-* Building Gandiva JNI bindings requires 3.11 or higher
-
-[1]: https://brew.sh/
-[2]: https://github.com/apache/arrow/blob/master/cpp/apidoc/Windows.md
-[3]: https://google.github.io/styleguide/cppguide.html
-[4]: https://github.com/include-what-you-use/include-what-you-use
-[5]: https://github.com/apache/arrow/blob/master/cpp/thirdparty/README.md
+[1]: https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst
diff --git a/cpp/apidoc/Windows.md b/cpp/apidoc/Windows.md
deleted file mode 100644
index 58c6fb1..0000000
--- a/cpp/apidoc/Windows.md
+++ /dev/null
@@ -1,291 +0,0 @@
-<!---
-  Licensed to the Apache Software Foundation (ASF) under one
-  or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
--->
-
-# Developing Arrow C++ on Windows
-
-## System setup, conda, and conda-forge
-
-Since some of the Arrow developers work in the Python ecosystem, we are
-investing time in maintaining the thirdparty build dependencies for Arrow and
-related C++ libraries using the conda package manager. Others are free to add
-other development instructions for Windows here.
-
-### conda and package toolchain
-
-[Miniconda][1] is a minimal Python distribution including the conda package
-manager. To get started, download and install a 64-bit distribution.
-
-We recommend using packages from [conda-forge][2].
-Launch cmd.exe and run the following command:
-
-```shell
-conda config --add channels conda-forge
-```
-
-Now, you can bootstrap a build environment (call from the root directory of the
-Arrow codebase):
-
-```shell
-conda create -n arrow-dev --file=ci\conda_env_cpp.yml
-```
-
-> **Note:** Make sure to get the `conda-forge` build of `gflags` as the
-> naming of the library differs from that in the `defaults` channel.
-
-Activate the newly created conda environment, with the packages pre-installed
-in the previous step:
-
-```shell
-activate arrow-dev
-```
-
-We use the [cmake][4] tool to support Windows builds.
-To allow cmake to pick up 3rd party dependencies, you should set the
-`ARROW_BUILD_TOOLCHAIN` environment variable to the `Library` folder
-of the `arrow-dev` conda environment created in the previous step.
-
-To set the `ARROW_BUILD_TOOLCHAIN` environment variable for the current terminal
-session only, run the following. `%CONDA_PREFIX%` is set by conda to the current
-environment root by the `activate` script.
-```shell
-set ARROW_BUILD_TOOLCHAIN=%CONDA_PREFIX%\Library
-```
-
-To check the value of the `ARROW_BUILD_TOOLCHAIN` environment variable, run the
-following terminal command:
-```shell
-echo %ARROW_BUILD_TOOLCHAIN%
-```
-
-As an alternative to `ARROW_BUILD_TOOLCHAIN`, it's possible to configure the path
-to each 3rd party dependency separately by setting the appropriate environment
-variable:
-
-`FLATBUFFERS_HOME` variable with path to `flatbuffers` installation
-`RAPIDJSON_HOME` variable with path to `rapidjson` installation
-`GFLAGS_HOME` variable with path to `gflags` installation
-`SNAPPY_HOME` variable with path to `snappy` installation
-`ZLIB_HOME` variable with path to `zlib` installation
-`BROTLI_HOME` variable with path to `brotli` installation
-`LZ4_HOME` variable with path to `lz4` installation
-`ZSTD_HOME` variable with path to `zstd` installation
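For example, to point the build at a locally installed dependency (the install path is a placeholder):

```shell
:: Illustrative cmd.exe session; use the folder where the dependency is installed
set FLATBUFFERS_HOME=C:\thirdparty\flatbuffers
```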
-
-### Customizing static library name lookup for 3rd party dependencies
-
-If you decide to use pre-built 3rd party dependency libraries, it's possible to
-configure Arrow's cmake build script to search for customized names of the 3rd
-party static libraries.
-
-`brotli`. Set `BROTLI_HOME` environment variable. Pass
-`-DBROTLI_MSVC_STATIC_LIB_SUFFIX=%BROTLI_SUFFIX%` to link with
-brotli*%BROTLI_SUFFIX%.lib. For brotli versions <= 0.6.0 installed from
-conda-forge this must be set to `_static`, otherwise the default of `-static`
-is used.
-
-`snappy`. Set `SNAPPY_HOME` environment variable. Pass
-`-DSNAPPY_MSVC_STATIC_LIB_SUFFIX=%SNAPPY_SUFFIX%` to link with
-snappy%SNAPPY_SUFFIX%.lib.
-
-`lz4`. Set `LZ4_HOME` environment variable. Pass
-`-DLZ4_MSVC_STATIC_LIB_SUFFIX=%LZ4_SUFFIX%` to link with
-lz4%LZ4_SUFFIX%.lib.
-
-`zstd`. Set `ZSTD_HOME` environment variable. Pass
-`-DZSTD_MSVC_STATIC_LIB_SUFFIX=%ZSTD_SUFFIX%` to link with
-zstd%ZSTD_SUFFIX%.lib.
-
-### Visual Studio
-
-Microsoft provides the free Visual Studio Community edition. When doing
-development, you must launch the developer command prompt using:
-
-#### Visual Studio 2015
-
-```
-"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" amd64
-```
-
-#### Visual Studio 2017
-
-```
-"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=amd64
-```
-
-It's easiest to configure a console emulator like [cmder][3] to automatically
-launch this when starting a new development console.
-
-## Building with Ninja and clcache
-
-We recommend the [Ninja](https://ninja-build.org/) build system for better
-build parallelization, and the optional
-[clcache](https://github.com/frerich/clcache/) compiler cache which keeps
-track of past compilations to avoid running them over and over again
-(in a way similar to the Unix-specific "ccache").
-
-Activate your conda build environment to first install those utilities:
-
-```shell
-activate arrow-dev
-
-conda install -c conda-forge ninja
-pip install git+https://github.com/frerich/clcache.git
-```
-
-Change working directory in cmd.exe to the root directory of Arrow and
-do an out-of-source build by generating Ninja files:
-
-```shell
-cd cpp
-mkdir build
-cd build
-cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Release ..
-cmake --build . --config Release
-```
-
-## Building with NMake
-
-Activate your conda build environment:
-
-```shell
-activate arrow-dev
-```
-
-Change working directory in cmd.exe to the root directory of Arrow and
-do an out-of-source build using `nmake`:
-
-```shell
-cd cpp
-mkdir build
-cd build
-cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release ..
-nmake
-
-When using conda, only release builds are currently supported.
-
-## Building using Visual Studio (MSVC) Solution Files
-
-Activate your conda build environment:
-
-```shell
-activate arrow-dev
-```
-
-Change working directory in cmd.exe to the root directory of Arrow and
-do an out-of-source build by generating a MSVC solution:
-
-```shell
-cd cpp
-mkdir build
-cd build
-cmake -G "Visual Studio 14 2015 Win64" -DCMAKE_BUILD_TYPE=Release ..
-cmake --build . --config Release
-```
-
-## Debug build
-
-To build a debug version of Arrow, you need a pre-installed debug version of
-the Boost libraries.
-
-It's recommended to configure the CMake build with the following variables for
-a debug build:
-
-`-DARROW_BOOST_USE_SHARED=OFF` - enables static linking with the Boost debug libs and
-simplifies run-time loading of third-party dependencies. (Recommended)
-
-`-DBOOST_ROOT` - sets the root directory of the Boost libs. (Optional)
-
-`-DBOOST_LIBRARYDIR` - sets the directory with the Boost lib files. (Optional)
-
-The command line to build Arrow in debug mode might look as follows:
-
-```shell
-cd cpp
-mkdir build
-cd build
-cmake -G "Visual Studio 14 2015 Win64" ^
-      -DARROW_BOOST_USE_SHARED=OFF ^
-      -DCMAKE_BUILD_TYPE=Debug ^
-      -DBOOST_ROOT=C:/local/boost_1_63_0  ^
-      -DBOOST_LIBRARYDIR=C:/local/boost_1_63_0/lib64-msvc-14.0 ^
-      ..
-cmake --build . --config Debug
-```
-
-To get the latest build instructions, you can reference [ci/appveyor-cpp-build.bat][5], which is used by automated Appveyor builds.
-
-## Replicating Appveyor Builds
-
-For people more familiar with Linux development who need to replicate a failing Appveyor build, here are some rough notes on
-replicating the Static_Crt_Build job (`make unittest` will probably still fail, but many unit tests can be built with their individual
-make targets).
-
-1.  Microsoft offers trial VMs for [Windows with Microsoft Visual Studio][6].  Download and install a version.
-2.  Run the VM and install CMake and Miniconda or Anaconda (these instructions assume Anaconda).
-3.  Download boost from [6] and install it (run from a command prompt opened by "Developer Command Prompt for MSVC 2017"):
-
-```shell
-cd $EXTRACT_BOOST_DIRECTORY
-.\bootstrap.bat
-@rem This is for static libraries needed for static_crt_build in appveyor
-.\b2 link=static --with-filesystem --with-regex --with-system install
-@rem this should put libraries and headers in c:\Boost
-```
-
-4. Activate anaconda/miniconda:
-
-```
-@rem this might differ for miniconda
-C:\Users\User\Anaconda3\Scripts\activate
-```
-
-5. Clone and change directories to the arrow source code (you might need to install git).
-6. Setup environment variables:
-
-```shell
-@rem Change the build type based on which appveyor job you want.
-SET JOB=Static_Crt_Build
-SET GENERATOR=Ninja
-SET APPVEYOR_BUILD_WORKER_IMAGE=Visual Studio 2017
-SET USE_CLCACHE=false
-SET ARROW_BUILD_GANDIVA=OFF
-SET ARROW_LLVM_VERSION=7.0.*
-SET PYTHON=3.6
-SET ARCH=64
-SET ARROW_BUILD_TOOLCHAIN=%CONDA_PREFIX%\Library
-SET PATH=C:\Users\User\Anaconda3;C:\Users\User\Anaconda3\Scripts;C:\Users\User\Anaconda3\Library\bin;%PATH%
-SET BOOST_LIBRARYDIR=C:\Boost\lib
-SET BOOST_ROOT=C:\Boost
-```
-7. Run appveyor scripts:
-
-```shell
-.\ci\appveyor-install.bat
-@rem this might fail but at this point most unit tests should be buildable by their individual targets
-@rem see next line for example.
-.\ci\appveyor-build.bat
-cmake --build . --config Release --target arrow-compute-hash-test
-```
-
-[1]: https://conda.io/miniconda.html
-[2]: https://conda-forge.github.io/
-[3]: http://cmder.net/
-[4]: https://cmake.org/
-[5]: https://github.com/apache/arrow/blob/master/ci/appveyor-cpp-build.bat
-[6]: https://developer.microsoft.com/en-us/windows/downloads/virtual-machines 
diff --git a/cpp/apidoc/index.md b/cpp/apidoc/index.md
deleted file mode 100644
index 076c297..0000000
--- a/cpp/apidoc/index.md
+++ /dev/null
@@ -1,42 +0,0 @@
-Apache Arrow C++ API documentation      {#index}
-==================================
-
-<!---
-  Licensed to the Apache Software Foundation (ASF) under one
-  or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
--->
-
-Apache Arrow is a columnar in-memory analytics layer designed to accelerate
-big data. It houses a set of canonical in-memory representations of flat and
-hierarchical data along with multiple language-bindings for structure
-manipulation. It also provides IPC and common algorithm implementations.
-
-This is the documentation of the C++ API of Apache Arrow. For more details
-on the format and other language bindings see
-the [main page for Arrow](https://arrow.apache.org/). Here we will only detail
-the usage of the C++ API for Arrow and the leaf libraries that add additional
-functionality such as using [jemalloc](http://jemalloc.net/) as an allocator
-for Arrow structures.
-
-Table of Contents
------------------
-
- * Instructions on how to build Arrow C++ on [Windows](Windows.md)
- * How to access [HDFS](HDFS.md)
- * Tutorials
-   * [Using the Plasma In-Memory Object Store](tutorials/plasma.md)
- * [Use Plasma to Access Tensors from C++ in Python](tutorials/tensor_to_py.md)
diff --git a/cpp/thirdparty/README.md b/cpp/thirdparty/README.md
index 9be3361..9518cd4 100644
--- a/cpp/thirdparty/README.md
+++ b/cpp/thirdparty/README.md
@@ -19,91 +19,7 @@
 
 # Arrow C++ Thirdparty Dependencies
 
-The version numbers for our third-party dependencies are listed in
-`thirdparty/versions.txt`. This is used by the CMake build system as well as
-the dependency downloader script (see below), which can be used to set up
-offline builds.
+See the "Build Dependency Management" section in the [C++ Developer
+Documentation][1].
 
-## Configuring your own build toolchain
-
-To set up your own specific build toolchain, here are the relevant environment
-variables
-
-* brotli: `BROTLI_HOME`, can be disabled with `-DARROW_WITH_BROTLI=off`
-* Boost: `BOOST_ROOT`
-* double-conversion: `DOUBLE_CONVERSION_HOME`
-* Googletest: `GTEST_HOME` (only required to build the unit tests)
-* gflags: `GFLAGS_HOME` (only required to build the unit tests)
-* glog: `GLOG_HOME` (only required if `ARROW_USE_GLOG=ON`)
-* Google Benchmark: `GBENCHMARK_HOME` (only required if building benchmarks)
-* Flatbuffers: `FLATBUFFERS_HOME` (only required for -DARROW_IPC=on, which is
-  the default)
-* Hadoop: `HADOOP_HOME` (only required for the HDFS I/O extensions)
-* jemalloc: `JEMALLOC_HOME`
-* lz4: `LZ4_HOME`, can be disabled with `-DARROW_WITH_LZ4=off`
-* Apache ORC: `ORC_HOME`
-* protobuf: `PROTOBUF_HOME`
-* rapidjson: `RAPIDJSON_HOME`
-* re2: `RE2_HOME` (only required to build Gandiva currently)
-* snappy: `SNAPPY_HOME`, can be disabled with `-DARROW_WITH_SNAPPY=off`
-* thrift: `THRIFT_HOME`
-* zlib: `ZLIB_HOME`, can be disabled with `-DARROW_WITH_ZLIB=off`
-* zstd: `ZSTD_HOME`, can be disabled with `-DARROW_WITH_ZSTD=off`
-
-If you have all of your toolchain libraries installed at the same prefix, you
-can use the environment variable `$ARROW_BUILD_TOOLCHAIN` to automatically set
-all of these variables. Note that `ARROW_BUILD_TOOLCHAIN` will not set
-`BOOST_ROOT`, so if you have custom Boost installation, you must set this
-environment variable separately.
-
-## Configuring for offline builds
-
-If you do not use the above variables to direct the Arrow build system to
-preinstalled dependencies, they will be built automatically by the build
-system. The source archive for each dependency will be downloaded via the
-internet, which can cause issues in environments with limited access to the
-internet.
-
-To enable offline builds, you can download the source artifacts yourself and
-use environment variables of the form `ARROW_$LIBRARY_URL` to direct the build
-system to read from a local file rather than accessing the internet.
-
-To make this easier for you, we have prepared a script
-`thirdparty/download_dependencies.sh` which will download the correct version
-of each dependency to a directory of your choosing. It will print a list of
-bash-style environment variable statements at the end to use for your build
-script:
-
-```shell
-# Download tarballs into `$HOME/arrow-thirdparty-deps`
-$ ./thirdparty/download_dependencies $HOME/arrow-thirdparty
-# Environment variables for offline Arrow build
-export ARROW_BOOST_URL=$HOME/arrow-thirdparty/boost-1.67.0.tar.gz
-export ARROW_BROTLI_URL=$HOME/arrow-thirdparty/brotli-v0.6.0.tar.gz
-export ARROW_DOUBLE_CONVERSION_URL=$HOME/arrow-thirdparty/double-conversion-v3.1.1.tar.gz
-export ARROW_FLATBUFFERS_URL=$HOME/arrow-thirdparty/flatbuffers-02a7807dd8d26f5668ffbbec0360dc107bbfabd5.tar.gz
-export ARROW_GBENCHMARK_URL=$HOME/arrow-thirdparty/gbenchmark-v1.4.1.tar.gz
-export ARROW_GFLAGS_URL=$HOME/arrow-thirdparty/gflags-v2.2.0.tar.gz
-export ARROW_GLOG_URL=$HOME/arrow-thirdparty/glog-v0.3.5.tar.gz
-export ARROW_GRPC_URL=$HOME/arrow-thirdparty/grpc-v1.14.1.tar.gz
-export ARROW_GTEST_URL=$HOME/arrow-thirdparty/gtest-1.8.0.tar.gz
-export ARROW_LZ4_URL=$HOME/arrow-thirdparty/lz4-v1.7.5.tar.gz
-export ARROW_ORC_URL=$HOME/arrow-thirdparty/orc-1.5.4.tar.gz
-export ARROW_PROTOBUF_URL=$HOME/arrow-thirdparty/protobuf-v3.6.1.tar.gz
-export ARROW_RAPIDJSON_URL=$HOME/arrow-thirdparty/rapidjson-v1.1.0.tar.gz
-export ARROW_RE2_URL=$HOME/arrow-thirdparty/re2-2018-10-01.tar.gz
-export ARROW_SNAPPY_URL=$HOME/arrow-thirdparty/snappy-1.1.3.tar.gz
-export ARROW_THRIFT_URL=$HOME/arrow-thirdparty/thrift-0.11.0.tar.gz
-export ARROW_ZLIB_URL=$HOME/arrow-thirdparty/zlib-1.2.8.tar.gz
-export ARROW_ZSTD_URL=$HOME/arrow-thirdparty/zstd-v1.3.7.tar.gz
-```
-
-This can be automated by using inline source/eval:
-
-```shell
-$ source <(./thirdparty/download_dependencies $HOME/arrow-thirdparty-deps)
-```
-
-You can then invoke CMake to create the build directory and it will use the
-declared environment variable pointing to downloaded archives instead of
-downloading them (one for each build dir!).
+[1]: https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst
\ No newline at end of file
diff --git a/docs/README.md b/docs/README.md
index 4430d65..aa0a231 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -26,5 +26,5 @@ Instructions for building the documentation site are found in
 [docs/source/building.rst][1]. The build depends on the API
 documentation for some of the project subcomponents.
 
-[1]: https://github.com/apache/arrow/blob/master/docs/source/building.rst
+[1]: https://github.com/apache/arrow/blob/master/docs/source/developers/documentation.rst
 [2]: https://github.com/apache/arrow/tree/master/docs/source/format
\ No newline at end of file
diff --git a/docs/source/developers/contributing.rst b/docs/source/developers/contributing.rst
new file mode 100644
index 0000000..326bdda
--- /dev/null
+++ b/docs/source/developers/contributing.rst
@@ -0,0 +1,88 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. _contributing:
+
+***********************
+Contribution Guidelines
+***********************
+
+There are many ways to contribute to Apache Arrow:
+
+* Contributing code (we call them "patches")
+* Writing documentation (another form of code, in a way)
+* Participating in discussions on JIRA or the mailing list
+* Helping users of the libraries
+* Reporting bugs and asking questions
+
+Mailing Lists and Issue Tracker
+===============================
+
+Projects in The Apache Software Foundation ("the ASF") use public, archived
+mailing lists to create a public record of each project's development
+activities and decision making process. As such, all contributors generally
+must be subscribed to the [email protected] mailing list to participate in
+the community.
+
+Note that you must be subscribed to the mailing list in order to post to it. To
+subscribe, send a blank email to [email protected].
+
+We use the `ASF JIRA <https://issues.apache.org/jira>`_ to manage our
+development "todo" list and to maintain changelogs for releases. You must
+create an account and be added as a "Contributor" to Apache Arrow to be able to
+assign yourself issues. Any project maintainer will be able to help you with
+this one-time setup.
+
+GitHub issues
+-------------
+
+We support GitHub issues as a lightweight way to ask questions and engage with
+the Arrow developer community. We use JIRA for maintaining a queue of
+development work and as the public record for work on the project. So, feel
+free to open GitHub issues, but bugs and feature requests will eventually need
+to end up in JIRA, either before or after completing a pull request. Don't be
+surprised if you are immediately asked by a project maintainer to open a JIRA
+issue.
+
+How to contribute patches
+=========================
+
+We prefer to receive contributions in the form of GitHub pull requests. Please
+send pull requests against the `github.com/apache/arrow
+<https://github.com/apache/arrow>`_ repository following the procedure below.
+
+If you are looking for some ideas on what to contribute, check out the JIRA
+issues for the Apache Arrow project. Comment on the issue and/or contact
[email protected] with your questions and ideas.
+
+If you’d like to report a bug but don’t have time to fix it, you can still post
+it on JIRA, or email the mailing list [email protected].
+
+To contribute a patch:
+
+* Break your work into small, single-purpose patches if possible. It’s much
+  harder to merge in a large change with a lot of disjoint features.
+* Create a JIRA for your patch on the Arrow Project JIRA.
+* Submit the patch as a GitHub pull request against the master branch. For a
+  tutorial, see the GitHub guides on forking a repo and sending a pull
+  request. Prefix your pull request name with the JIRA name (ex:
+  https://github.com/apache/arrow/pull/240).
+* Make sure that your code passes the unit tests. You can find instructions how
+  to run the unit tests for each Arrow component in its respective README file.
+* Add new unit tests for your code.
+
+Thank you in advance for your contributions!
diff --git a/docs/source/developers/cpp.rst b/docs/source/developers/cpp.rst
new file mode 100644
index 0000000..1e332dc
--- /dev/null
+++ b/docs/source/developers/cpp.rst
@@ -0,0 +1,913 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. _cpp-development:
+
+***************
+C++ Development
+***************
+
+System setup
+============
+
+Arrow uses CMake as a build configuration system. We recommend building
+out-of-source. If you are not familiar with this terminology:
+
+* **In-source build**: ``cmake`` is invoked directly from the ``cpp``
+  directory. This can be inflexible when you wish to maintain multiple build
+  environments (e.g. one for debug builds and another for release builds)
+* **Out-of-source build**: ``cmake`` is invoked from another directory,
+  creating an isolated build environment that does not interact with any other
+  build environment. For example, you could create ``cpp/build-debug`` and
+  invoke ``cmake $CMAKE_ARGS ..`` from this directory
+
+Building requires:
+
+* A C++11-enabled compiler. On Linux, gcc 4.8 and higher should be
+  sufficient. For Windows, at least Visual Studio 2015 is required.
+* CMake 3.2 or higher
+* Boost
+* ``bison`` and ``flex`` (for building Apache Thrift from source only, an
+  Apache Parquet dependency.)
+
+Running the unit tests using ``ctest`` requires:
+
+* python
+
+On Ubuntu/Debian you can install the requirements with:
+
+.. code-block:: shell
+
+   sudo apt-get install \
+        autoconf \
+        build-essential \
+        cmake \
+        libboost-dev \
+        libboost-filesystem-dev \
+        libboost-regex-dev \
+        libboost-system-dev \
+        python \
+        bison \
+        flex
+
+On Alpine Linux:
+
+.. code-block:: shell
+
+   apk add autoconf \
+           bash \
+           boost-dev \
+           cmake \
+           g++ \
+           gcc \
+           make
+
+On macOS, you can use `Homebrew <https://brew.sh/>`_.
+
+.. code-block:: shell
+
+   git clone https://github.com/apache/arrow.git
+   cd arrow
+   brew update && brew bundle --file=c_glib/Brewfile
+
+Building
+========
+
+The build system uses ``CMAKE_BUILD_TYPE=release`` by default, so if this
+argument is omitted then a release build will be produced.
+
+Minimal release build:
+
+.. code-block:: shell
+
+   git clone https://github.com/apache/arrow.git
+   cd arrow/cpp
+   mkdir release
+   cd release
+   cmake -DARROW_BUILD_TESTS=ON  ..
+   make unittest
+
+Minimal debug build:
+
+.. code-block:: shell
+
+   git clone https://github.com/apache/arrow.git
+   cd arrow/cpp
+   mkdir debug
+   cd debug
+   cmake -DCMAKE_BUILD_TYPE=Debug -DARROW_BUILD_TESTS=ON ..
+   make unittest
+
+If you do not need to build the test suite, you can omit the
+``ARROW_BUILD_TESTS`` option (the default is not to build the unit tests).
+
+On some Linux distributions, running the test suite might require setting an
+explicit locale. If you see any locale-related errors, try setting the
+environment variable (which requires the ``locales`` package or equivalent):
+
+.. code-block:: shell
+
+   export LC_ALL="en_US.UTF-8"
+
+Faster builds with Ninja
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many contributors use the `Ninja build system <https://ninja-build.org/>`_ to
+get faster builds. It especially speeds up incremental builds. To use
+``ninja``, pass ``-GNinja`` when calling ``cmake`` and then use the ``ninja``
+command instead of ``make``.
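
As a sketch (the build directory name below is arbitrary):

```shell
# Out-of-source build configured for Ninja; run from a fresh
# directory under cpp/, e.g. cpp/build-ninja.
cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DARROW_BUILD_TESTS=ON ..
ninja           # build all configured targets
ninja unittest  # build and run the unit tests
```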
+
+Optional Components
+~~~~~~~~~~~~~~~~~~~
+
+By default, the C++ build system creates a fairly minimal build. We have
+several optional system components which you can opt into building by passing
+boolean flags to ``cmake``.
+
+* ``-DARROW_CUDA=ON``: CUDA integration for GPU development. Depends on NVIDIA
+  CUDA toolkit. The CUDA toolchain used to build the library can be customized
+  by using the ``$CUDA_HOME`` environment variable.
+* ``-DARROW_FLIGHT=ON``: Arrow Flight RPC system, which depends at least on
+  gRPC
+* ``-DARROW_GANDIVA=ON``: Gandiva expression compiler, depends on LLVM,
+  Protocol Buffers, and re2
+* ``-DARROW_HDFS=ON``: Arrow integration with libhdfs for accessing the Hadoop
+  Filesystem
+* ``-DARROW_HIVESERVER2=ON``: Client library for HiveServer2 database protocol
+* ``-DARROW_ORC=ON``: Arrow integration with Apache ORC
+* ``-DARROW_PARQUET=ON``: Apache Parquet libraries and Arrow integration
+* ``-DARROW_PLASMA=ON``: Plasma Shared Memory Object Store
+* ``-DARROW_PLASMA_JAVA_CLIENT=ON``: Build Java client for Plasma
+* ``-DARROW_PYTHON=ON``: Arrow Python C++ integration library (required for
+  building pyarrow). This library must be built against the same Python version
+  for which you are building pyarrow, e.g. Python 2.7 or Python 3.6. NumPy must
+  also be installed.
+
+Some features of the core Arrow shared library can be switched off for improved
+build times if they are not required for your application:
+
+* ``-DARROW_COMPUTE=ON``: build the in-memory analytics module
+* ``-DARROW_IPC=ON``: build the IPC extensions (requiring Flatbuffers)
+
+CMake version requirements
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+While we support CMake 3.2 and higher, some features require a newer version of
+CMake:
+
+* Building the benchmarks requires 3.6 or higher
+* Building zstd from source requires 3.7 or higher
+* Building Gandiva JNI bindings requires 3.11 or higher
+
+LLVM and Clang Tools
+~~~~~~~~~~~~~~~~~~~~
+
+We are currently using LLVM 7 for library builds and for other developer tools
+such as code formatting with ``clang-format``. LLVM can be installed via most
+modern package managers (apt, yum, conda, Homebrew, chocolatey).
+
+Build Dependency Management
+===========================
+
+The build system supports a number of third-party dependencies:
+
+  * ``BOOST``: for cross-platform support
+  * ``BROTLI``: for data compression
+  * ``double-conversion``: for text-to-numeric conversions
+  * ``Snappy``: for data compression
+  * ``gflags``: for command line utilities (formerly Googleflags)
+  * ``glog``: for logging
+  * ``Thrift``: Apache Thrift, for data serialization
+  * ``Protobuf``: Google Protocol Buffers, for data serialization
+  * ``GTEST``: Googletest, for testing
+  * ``benchmark``: Google benchmark, for testing
+  * ``RapidJSON``: for data serialization
+  * ``Flatbuffers``: for data serialization
+  * ``ZLIB``: for data compression
+  * ``BZip2``: for data compression
+  * ``LZ4``: for data compression
+  * ``ZSTD``: for data compression
+  * ``RE2``: for regular expressions
+  * ``gRPC``: for remote procedure calls
+  * ``c-ares``: a dependency of gRPC
+  * ``LLVM``: a dependency of Gandiva
+
+The CMake option ``ARROW_DEPENDENCY_SOURCE`` is a global option that instructs
+the build system how to resolve each dependency. There are a few options:
+
+* ``AUTO``: try to find the package in the system default locations and build
+  from source if not found
+* ``BUNDLED``: build the dependency automatically from source
+* ``SYSTEM``: find the dependency in system paths using CMake's built-in
+  ``find_package`` function, or using ``pkg-config`` for packages that do not
+  have this feature
+* ``BREW``: use Homebrew default paths as an alternative ``SYSTEM`` path
+* ``CONDA``: use ``$CONDA_PREFIX`` as an alternative ``SYSTEM`` path
+
+The default method is ``AUTO`` unless you are developing within an active conda
+environment (detected by presence of the ``$CONDA_PREFIX`` environment
+variable), in which case it is ``CONDA``.
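
For example, to ignore any locally installed packages and build every dependency from source (a sketch; any of the values listed above can be substituted):

```shell
# Force all third-party dependencies to be built from source.
cmake -DARROW_DEPENDENCY_SOURCE=BUNDLED ..
```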
+
+Individual Dependency Resolution
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+While ``-DARROW_DEPENDENCY_SOURCE=$SOURCE`` sets a global default for all
+packages, the resolution strategy can be overridden for individual packages by
+setting ``-D$PACKAGE_NAME_SOURCE=..``. For example, to build Protocol Buffers
+from source, set
+
+.. code-block:: shell
+
+   -DProtobuf_SOURCE=BUNDLED
+
+This variable is unfortunately case-sensitive; the name used for each package
+is listed above, but the most up-to-date listing can be found in
+`cpp/cmake_modules/ThirdpartyToolchain.cmake <https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake>`_.
+
+Bundled Dependency Versions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When using the ``BUNDLED`` method to build a dependency from source, the
+version number from ``cpp/thirdparty/versions.txt`` is used. There is also a
+dependency source downloader script (see below), which can be used to set up
+offline builds.
+
+Boost-related Options
+~~~~~~~~~~~~~~~~~~~~~
+
+We depend on some Boost C++ libraries for cross-platform support. In most cases,
+the Boost version available in your package manager may be new enough, and the
+build system will find it automatically. If you have Boost installed in a
+non-standard location, you can specify it by passing
+``-DBOOST_ROOT=$MY_BOOST_ROOT`` or setting the ``BOOST_ROOT`` environment
+variable.
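
A sketch of pointing the build at a non-standard Boost installation; ``/opt/boost_1_67_0`` below is a hypothetical install prefix, substitute your own:

```shell
# Either pass the prefix on the cmake command line...
cmake -DBOOST_ROOT=/opt/boost_1_67_0 ..
# ...or export it as an environment variable before configuring.
BOOST_ROOT=/opt/boost_1_67_0 cmake ..
```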
+
+Unlike most of the other dependencies, if Boost is not found by the build
+system it will not be built automatically from source. To opt-in to a vendored
+Boost build, pass ``-DARROW_BOOST_VENDORED=ON``. This automatically sets the
+option ``-DARROW_BOOST_USE_SHARED=OFF`` to statically-link Boost into the
+produced libraries and executables.
+
+Offline Builds
+~~~~~~~~~~~~~~
+
+If you do not use the above variables to direct the Arrow build system to
+preinstalled dependencies, they will be built automatically by the Arrow build
+system. The source archive for each dependency will be downloaded via the
+internet, which can cause issues in environments with limited access to the
+internet.
+
+To enable offline builds, you can download the source artifacts yourself and
+use environment variables of the form ``ARROW_$LIBRARY_URL`` to direct the
+build system to read from a local file rather than accessing the internet.
+
+To make this easier for you, we have prepared a script
+``thirdparty/download_dependencies.sh`` which will download the correct version
+of each dependency to a directory of your choosing. It will print a list of
+bash-style environment variable statements at the end to use for your build
+script.
+
+.. code-block:: shell
+
+   # Download tarballs into $HOME/arrow-thirdparty
+   $ ./thirdparty/download_dependencies.sh $HOME/arrow-thirdparty
+
+You can then invoke CMake to create the build directory; it will use the
+declared environment variables, pointing at the downloaded archives, instead of
+downloading them anew (which would otherwise happen once for each build
+directory!).
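
The printed statements can also be applied to the current shell in a single step using bash process substitution, as the previous version of these instructions suggested:

```shell
# Evaluate the export statements emitted by the download script (bash only).
source <(./thirdparty/download_dependencies.sh $HOME/arrow-thirdparty)
```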
+
+General C++ Development
+=======================
+
+This section provides information for developers who wish to contribute to the
+C++ codebase.
+
+.. note::
+
+   Since most of the project's developers work on Linux or macOS, not all
+   features or developer tools are uniformly supported on Windows. If you are
+   on Windows, have a look at the later section on Windows development.
+
+Code Style, Linting, and CI
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This project follows `Google's C++ Style Guide
+<https://google.github.io/styleguide/cppguide.html>`_ with minor exceptions:
+
+* We relax the line length restriction to 90 characters.
+* We use doxygen style comments ("///") in header files for comments that we
+  wish to show up in API documentation
+* We use the ``NULLPTR`` macro in header files (instead of ``nullptr``) defined
+  in ``src/arrow/util/macros.h`` to support building C++/CLI (ARROW-1134)
+
+Our continuous integration builds in Travis CI and Appveyor run the unit test
+suites on a variety of platforms and configurations, including using
+``valgrind`` to check for memory leaks or bad memory accesses. In addition, the
+codebase is subjected to a number of code style and code cleanliness checks.
+
+In order to have a passing CI build, your modified git branch must pass the
+following checks:
+
+* C++ builds without compiler warnings with ``-DBUILD_WARNING_LEVEL=CHECKIN``
+* C++ unit test suite with valgrind enabled, use ``-DARROW_TEST_MEMCHECK=ON``
+  when invoking CMake
+* Passes cpplint checks, checked with ``make lint``
+* Conforms to ``clang-format`` style, checked with ``make check-format``
+* Passes C++/CLI header file checks, invoked with
+  ``cpp/build-support/lint_cpp_cli.py cpp/src``
+* CMake files pass style checks, can be fixed by running
+  ``run-cmake-format.py`` from the root of the repository. This requires Python
+  3 and `cmake_format <https://github.com/cheshirekow/cmake_format>`_ (note:
+  this currently does not work on Windows)
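
A sketch of running the style-related subset of these checks locally before pushing (assuming a configured build directory and the pinned clang tools installed):

```shell
make lint          # cpplint checks
make check-format  # clang-format style conformance
# C++/CLI header checks, run from the repository root:
python cpp/build-support/lint_cpp_cli.py cpp/src
```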
+
+In order to account for variations in the behavior of ``clang-format`` between
+major versions of LLVM, we pin the version of ``clang-format`` used (current
+LLVM 7).
+
+Depending on how you installed clang-format, the build system may not be able
+to find it. You can provide an explicit path to your LLVM installation (or the
+root path for the clang tools) with the environment variable
+``$CLANG_TOOLS_PATH`` or by passing ``-DClangTools_PATH=$PATH_TO_CLANG_TOOLS``
+when invoking CMake.
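
For example, if LLVM 7 was installed into a versioned directory that is not on your ``PATH`` (``/usr/lib/llvm-7/bin`` below is a hypothetical example location):

```shell
# Tell the build system where the pinned clang tools live
# (/usr/lib/llvm-7/bin is a hypothetical example path).
export CLANG_TOOLS_PATH=/usr/lib/llvm-7/bin
echo $CLANG_TOOLS_PATH
```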
+
+To make linting more reproducible for everyone, we provide a ``docker-compose``
+target that is executable from the root of the repository:
+
+.. code-block:: shell
+
+   docker-compose run lint
+
+See :ref:`integration` for more information about the project's
+``docker-compose`` configuration.
+
+Modular Build Targets
+~~~~~~~~~~~~~~~~~~~~~
+
+Since there are several major parts of the C++ project, we have provided
+modular CMake targets for building each library component, group of unit tests
+and benchmarks, and their dependencies:
+
+* ``make arrow`` for Arrow core libraries
+* ``make parquet`` for Parquet libraries
+* ``make gandiva`` for Gandiva (LLVM expression compiler) libraries
+* ``make plasma`` for Plasma libraries, server
+
+To build the unit tests or benchmarks, add ``-tests`` or ``-benchmarks`` to the
+target name. So ``make arrow-tests`` will build the Arrow core unit
+tests. Using the ``-all`` target, e.g. ``parquet-all``, will build everything.
+
+If you wish to only build and install one or more project subcomponents, we
+have provided the CMake option ``ARROW_OPTIONAL_INSTALL`` to only install
+targets that have been built. For example, if you only wish to build the
+Parquet libraries, its tests, and its dependencies, you can run:
+
+.. code-block:: shell
+
+   cmake .. -DARROW_PARQUET=ON \
+         -DARROW_OPTIONAL_INSTALL=ON \
+         -DARROW_BUILD_TESTS=ON
+   make parquet
+   make install
+
+If you omit an explicit target when invoking ``make``, all targets will be
+built.
+
+Building API Documentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+While we publish the API documentation as part of the main Sphinx-based
+documentation site, you can also build the C++ API documentation anytime using
+Doxygen. Run the following command from the ``cpp/apidoc`` directory:
+
+.. code-block:: shell
+
+   doxygen Doxyfile
+
+This requires `Doxygen <https://www.doxygen.org>`_ to be installed.
+
+Benchmarking
+~~~~~~~~~~~~
+
+Follow the directions for the simple build, except run ``cmake`` with the
+``ARROW_BUILD_BENCHMARKS`` parameter set to ``ON``:
+
+.. code-block:: shell
+
+    cmake -DARROW_BUILD_TESTS=ON -DARROW_BUILD_BENCHMARKS=ON ..
+
+Then, instead of ``make unittest``, run either ``make; ctest`` to run both
+unit tests and benchmarks, or ``make benchmark`` to run only the benchmarks.
+Benchmark logs will be placed in the build directory under
+``build/benchmark-logs``.
+
+You can also invoke a single benchmark executable directly:
+
+.. code-block:: shell
+
+   ./release/arrow-builder-benchmark
+
+The build system uses ``CMAKE_BUILD_TYPE=release`` by default, which enables
+compiler optimizations. To obtain more consistent and comparable benchmark
+results, it is also recommended to disable CPU throttling and hardware
+features such as "Turbo Boost".
+
+Testing with LLVM AddressSanitizer
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To use AddressSanitizer (ASAN) to find bad memory accesses or leaks with LLVM,
+pass ``-DARROW_USE_ASAN=ON`` when building. You must use clang to compile with
+ASAN, and ``ARROW_USE_ASAN`` is mutually-exclusive with the valgrind option
+``ARROW_TEST_MEMCHECK``.
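+
+For example, a build with ASAN enabled might be configured as follows (the
+``clang``/``clang++`` names are assumptions; use whatever clang executables
+your system provides):
+
+.. code-block:: shell
+
+   CC=clang CXX=clang++ cmake -DARROW_USE_ASAN=ON -DARROW_BUILD_TESTS=ON ..
+   make unittest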
+
+Fuzz testing with libfuzzer
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Fuzzers can help find unhandled exceptions and problems with untrusted input
+that may lead to crashes, security issues and undefined behavior. They do this
+by generating random input data and observing the behavior of the executed
+code. Building the fuzzer code requires LLVM (GCC-based compilers won't
+work). You can build it as follows:
+
+.. code-block:: shell
+
+   cmake -DARROW_FUZZING=ON -DARROW_USE_ASAN=ON ..
+
+``ARROW_FUZZING`` will enable building of fuzzer executables as well as enable
+the addition of coverage helpers via ``ARROW_USE_COVERAGE``, so that the
+fuzzer can observe the program execution.
+
+It is also wise to enable a sanitizer like ``ARROW_USE_ASAN`` (see above),
+which activates the address sanitizer. This way, we ensure that bad memory
+operations provoked by the fuzzer will be found early. You may enable other
+sanitizers as well; just keep in mind that some of them do not work together
+and some may result in very long execution times, which will slow down the
+fuzzing procedure.
+
+Now you can start one of the fuzzers, e.g.:
+
+.. code-block:: shell
+
+   ./debug/debug/ipc-fuzzing-test
+
+This will try to find a malformed input that crashes the payload and will show
+the stack trace as well as the input data. After a problem was found this way,
+it should be reported and fixed. Usually, the fuzzing process cannot be
+continued until the fix is applied, since the fuzzer will usually converge to
+the same problem again.
+
+If you build fuzzers with ASAN, you need to set the ``ASAN_SYMBOLIZER_PATH``
+environment variable to the absolute path of ``llvm-symbolizer``, which is a
+tool that ships with LLVM.
+
+.. code-block:: shell
+
+   export ASAN_SYMBOLIZER_PATH=$(type -p llvm-symbolizer)
+
+Note that some fuzzer builds currently reject paths with a version qualifier
+(like ``llvm-symbolizer-5.0``). To overcome this, set an appropriate symlink
+(here, when using LLVM 5.0):
+
+.. code-block:: shell
+
+   ln -sf /usr/bin/llvm-symbolizer-5.0 /usr/bin/llvm-symbolizer
+
+There are some problems that may occur during the compilation process:
+
+- libfuzzer was not distributed with your LLVM:
+  ``ld: file not found: .../libLLVMFuzzer.a``
+- your LLVM is too old:
+  ``clang: error: unsupported argument 'fuzzer' to option 'fsanitize='``
+
+Extra debugging help
+~~~~~~~~~~~~~~~~~~~~
+
+If you use the CMake option ``-DARROW_EXTRA_ERROR_CONTEXT=ON`` it will compile
+the libraries with extra debugging information on error checks inside the
+``RETURN_NOT_OK`` macro. In unit tests with ``ASSERT_OK``, this will yield
+error outputs like:
+
+.. code-block:: shell
+
+   ../src/arrow/ipc/ipc-read-write-test.cc:609: Failure
+   Failed
+   ../src/arrow/ipc/metadata-internal.cc:508 code: TypeToFlatbuffer(fbb, *field.type(), &children, &layout, &type_enum, dictionary_memo, &type_offset)
+   ../src/arrow/ipc/metadata-internal.cc:598 code: FieldToFlatbuffer(fbb, *schema.field(i), dictionary_memo, &offset)
+   ../src/arrow/ipc/metadata-internal.cc:651 code: SchemaToFlatbuffer(fbb, schema, dictionary_memo, &fb_schema)
+   ../src/arrow/ipc/writer.cc:697 code: WriteSchemaMessage(schema_, dictionary_memo_, &schema_fb)
+   ../src/arrow/ipc/writer.cc:730 code: WriteSchema()
+   ../src/arrow/ipc/writer.cc:755 code: schema_writer.Write(&dictionaries_)
+   ../src/arrow/ipc/writer.cc:778 code: CheckStarted()
+   ../src/arrow/ipc/ipc-read-write-test.cc:574 code: writer->WriteRecordBatch(batch)
+   NotImplemented: Unable to convert type: decimal(19, 4)
+
+Deprecations and API Changes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We use the compiler definition ``ARROW_NO_DEPRECATED_API`` to disable APIs that
+have been deprecated. It is a good practice to compile third party applications
+with this flag to proactively catch and account for API changes.
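+
+For example, a third party project using a plain CMake invocation might pass
+the definition through ``CMAKE_CXX_FLAGS`` (a sketch, not the only way to set
+a preprocessor definition):
+
+.. code-block:: shell
+
+   cmake -DCMAKE_CXX_FLAGS="-DARROW_NO_DEPRECATED_API" ..
+   make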
+
+Cleaning includes with include-what-you-use (IWYU)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We occasionally use Google's `include-what-you-use
+<https://github.com/include-what-you-use/include-what-you-use>`_ tool, also
+known as IWYU, to remove unnecessary includes. Since setting up IWYU can be a
+bit tedious, we provide a ``docker-compose`` target for running it on the C++
+codebase:
+
+.. code-block:: shell
+
+   make -f Makefile.docker build-iwyu
+   docker-compose run lint
+
+Checking for ABI and API stability
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To build ABI compliance reports, you need to install the two tools
+``abi-dumper`` and ``abi-compliance-checker``.
+
+Build Arrow C++ in Debug mode; alternatively you could use ``-Og``, which also
+builds with the necessary symbols but includes a bit of code optimization.
+Once the build has finished, you can generate ABI reports using:
+
+.. code-block:: shell
+
+   abi-dumper -lver 9 debug/libarrow.so -o ABI-9.dump
+
+The above version number is freely selectable. As we want to compare versions,
+you should now ``git checkout`` the version you want to compare it to and
+re-run the above command using a different version number. Once both reports
+are generated, you can build a comparison report using:
+
+.. code-block:: shell
+
+   abi-compliance-checker -l libarrow -d1 ABI-9.dump -d2 ABI-10.dump
+
+The report is then generated in ``compat_reports/libarrow`` as HTML.
+
+Developing on Windows
+=====================
+
+As on Linux and macOS, we have worked to enable builds to work "out of the
+box" with CMake for a reasonably large subset of the project.
+
+System Setup
+~~~~~~~~~~~~
+
+Microsoft provides the free Visual Studio Community edition. When doing
+development in the shell, you must first initialize the development
+environment.
+
+For Visual Studio 2015, execute the following batch script:
+
+.. code-block:: shell
+
+   "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" amd64
+
+For Visual Studio 2017, the script is:
+
+.. code-block:: shell
+
+   "C:\Program Files (x86)\Microsoft Visual 
Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=amd64
+
+One can configure a console emulator like `cmder <https://cmder.net/>`_ to
+automatically launch this when starting a new development console.
+
+Using conda-forge for build dependencies
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+`Miniconda <https://conda.io/miniconda.html>`_ is a minimal Python distribution
+including the `conda <https://conda.io>`_ package manager. Some members of the
+Apache Arrow community participate in the maintenance of `conda-forge
+<https://conda-forge.org/>`_, a community-maintained cross-platform package
+repository for conda.
+
+To use ``conda-forge`` for your C++ build dependencies on Windows, first
+download and install a 64-bit distribution from the `Miniconda homepage
+<https://conda.io/miniconda.html>`_.
+
+To configure ``conda`` to use the ``conda-forge`` channel by default, launch a
+command prompt (``cmd.exe``) and run the command:
+
+.. code-block:: shell
+
+   conda config --add channels conda-forge
+
+Now, you can bootstrap a build environment (run this from the root directory
+of the Arrow codebase):
+
+.. code-block:: shell
+
+   conda create -y -n arrow-dev --file=ci\conda_env_cpp.yml
+
+Then "activate" this conda environment with:
+
+.. code-block:: shell
+
+   activate arrow-dev
+
+If the environment has been activated, the Arrow build system will
+automatically see the ``%CONDA_PREFIX%`` environment variable and use that for
+resolving the build dependencies. This is equivalent to setting
+
+.. code-block:: shell
+
+   -DARROW_DEPENDENCY_SOURCE=SYSTEM ^
+   -DARROW_PACKAGE_PREFIX=%CONDA_PREFIX%\Library
+
+Note that these packages are only supported for release builds. If you intend
+to use ``-DCMAKE_BUILD_TYPE=debug`` then you must build the packages from
+source.
+
+.. note::
+
+   If you run into any problems using conda packages for dependencies, a very
+   common problem is mixing packages from the ``defaults`` channel with those
+   from ``conda-forge``. You can examine the installed packages in your
+   environment (and their origin) with ``conda list``.
+
+Building using Visual Studio (MSVC) Solution Files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Change the working directory in ``cmd.exe`` to the root directory of Arrow and
+do an out-of-source build by generating an MSVC solution:
+
+.. code-block:: shell
+
+   cd cpp
+   mkdir build
+   cd build
+   cmake -G "Visual Studio 14 2015 Win64" ..
+   cmake --build . --config Release
+
+Building with Ninja and clcache
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The `Ninja <https://ninja-build.org/>`_ build system offers better build
+parallelization, and the optional `clcache
+<https://github.com/frerich/clcache/>`_ compiler cache keeps track of past
+compilations to avoid running them over and over again (in a way similar to
+the Unix-specific ``ccache``).
+
+First activate your conda build environment and install those utilities:
+
+.. code-block:: shell
+
+   activate arrow-dev
+   conda install -c conda-forge ninja
+   pip install git+https://github.com/frerich/clcache.git
+
+Change the working directory in ``cmd.exe`` to the root directory of Arrow and
+do an out-of-source build by generating Ninja files:
+
+.. code-block:: shell
+
+   cd cpp
+   mkdir build
+   cd build
+   cmake -G "Ninja" ..
+   cmake --build . --config Release
+
+Building with NMake
+~~~~~~~~~~~~~~~~~~~
+
+Change the working directory in ``cmd.exe`` to the root directory of Arrow and
+do an out-of-source build using ``nmake``:
+
+.. code-block:: shell
+
+   cd cpp
+   mkdir build
+   cd build
+   cmake -G "NMake Makefiles" ..
+   nmake
+
+Debug builds
+~~~~~~~~~~~~
+
+To build a Debug version of Arrow you should have pre-installed a Debug
+version of Boost. It is recommended to configure the CMake build with the
+following variables for a Debug build:
+
+* ``-DARROW_BOOST_USE_SHARED=OFF``: enables static linking with Boost debug
+  libraries and simplifies run-time loading of third-party dependencies
+* ``-DBOOST_ROOT``: sets the root directory of Boost libraries (optional)
+* ``-DBOOST_LIBRARYDIR``: sets the directory with Boost library files
+  (optional)
+
+The command line to build Arrow in Debug will look something like this:
+
+.. code-block:: shell
+
+   cd cpp
+   mkdir build
+   cd build
+   cmake .. -G "Visual Studio 14 2015 Win64" ^
+         -DARROW_BOOST_USE_SHARED=OFF ^
+         -DCMAKE_BUILD_TYPE=Debug ^
+         -DBOOST_ROOT=C:/local/boost_1_63_0  ^
+         -DBOOST_LIBRARYDIR=C:/local/boost_1_63_0/lib64-msvc-14.0
+   cmake --build . --config Debug
+
+Windows dependency resolution issues
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Because Windows uses ``.lib`` files for both static and dynamic linking of
+dependencies, a static library may be named differently, for example
+``%PACKAGE%_static.lib``, to distinguish itself. If you are statically linking
+some dependencies, we provide the following options to account for this:
+
+* ``-DBROTLI_MSVC_STATIC_LIB_SUFFIX=%BROTLI_SUFFIX%``
+* ``-DSNAPPY_MSVC_STATIC_LIB_SUFFIX=%SNAPPY_SUFFIX%``
+* ``-DLZ4_MSVC_STATIC_LIB_SUFFIX=%LZ4_SUFFIX%``
+* ``-DZSTD_MSVC_STATIC_LIB_SUFFIX=%ZSTD_SUFFIX%``
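+
+As an illustration, if your static Brotli and zstd libraries were built with a
+hypothetical ``_static`` suffix, the CMake invocation might look like:
+
+.. code-block:: shell
+
+   cmake -G "Visual Studio 14 2015 Win64" ^
+         -DBROTLI_MSVC_STATIC_LIB_SUFFIX=_static ^
+         -DZSTD_MSVC_STATIC_LIB_SUFFIX=_static ..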
+
+To get the latest build instructions, you can reference
+`ci/appveyor-cpp-build.bat
+<https://github.com/apache/arrow/blob/master/ci/appveyor-cpp-build.bat>`_,
+which is used by automated Appveyor builds.
+
+Statically linking to Arrow on Windows
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On Windows, the Arrow headers in static library builds (enabled by the CMake
+option ``ARROW_BUILD_STATIC``) use the preprocessor macro ``ARROW_STATIC`` to
+suppress dllimport/dllexport marking of symbols. Projects that statically link
+against Arrow on Windows additionally need this definition. The Unix builds do
+not use the macro.
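+
+As a sketch, a project compiling against a static Arrow build might define the
+macro on the compiler command line (the library and include paths here are
+assumptions about the install layout):
+
+.. code-block:: shell
+
+   cl /EHsc /DARROW_STATIC /I%ARROW_HOME%\include myapp.cc ^
+      /link %ARROW_HOME%\lib\arrow_static.lib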
+
+Replicating Appveyor Builds
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For people who are more familiar with Linux development but need to replicate
+a failing Appveyor build, here are some rough notes on replicating the
+``Static_Crt_Build`` job (``make unittest`` will probably still fail, but many
+unit tests can be built with their individual make targets).
+
+1. Microsoft offers trial VMs for `Windows with Microsoft Visual Studio
+   <https://developer.microsoft.com/en-us/windows/downloads/virtual-machines>`_.
+   Download and install a version.
+2. Run the VM and install CMake and Miniconda or Anaconda (these instructions
+   assume Anaconda).
+3. Download `pre-built Boost debug binaries
+   <https://sourceforge.net/projects/boost/files/boost-binaries/>`_ and install
+   them (run from a command prompt opened by "Developer Command Prompt for
+   MSVC 2017"):
+
+.. code-block:: shell
+
+   cd $EXTRACT_BOOST_DIRECTORY
+   .\bootstrap.bat
+   @rem This is for the static libraries needed for Static_Crt_Build in appveyor
+   .\b2 link=static --with-filesystem --with-regex --with-system install
+   @rem this should put libraries and headers in c:\Boost
+
+4. Activate Anaconda/Miniconda:
+
+.. code-block:: shell
+
+  @rem this might differ for miniconda
+  C:\Users\User\Anaconda3\Scripts\activate
+
+5. Clone and change directories to the arrow source code (you might need to
+   install git).
+6. Setup environment variables:
+
+.. code-block:: shell
+
+   @rem Change the build type based on which appveyor job you want.
+   SET JOB=Static_Crt_Build
+   SET GENERATOR=Ninja
+   SET APPVEYOR_BUILD_WORKER_IMAGE=Visual Studio 2017
+   SET USE_CLCACHE=false
+   SET ARROW_BUILD_GANDIVA=OFF
+   SET ARROW_LLVM_VERSION=7.0.*
+   SET PYTHON=3.6
+   SET ARCH=64
+   SET ARROW_BUILD_TOOLCHAIN=%CONDA_PREFIX%\Library
+   SET PATH=C:\Users\User\Anaconda3;C:\Users\User\Anaconda3\Scripts;C:\Users\User\Anaconda3\Library\bin;%PATH%
+   SET BOOST_LIBRARYDIR=C:\Boost\lib
+   SET BOOST_ROOT=C:\Boost
+
+7. Run the Appveyor scripts:
+
+.. code-block:: shell
+
+   .\ci\appveyor-install.bat
+   @rem this might fail, but at this point most unit tests should be buildable
+   @rem by their individual targets; see the next line for an example.
+   .\ci\appveyor-build.bat
+   cmake --build . --config Release --target arrow-compute-hash-test
+
+Apache Parquet Development
+==========================
+
+To build the C++ libraries for Apache Parquet, add the flag
+``-DARROW_PARQUET=ON`` when invoking CMake. The Parquet libraries and unit
+tests can be built with the ``parquet`` make target:
+
+.. code-block:: shell
+
+   make parquet
+
+Running ``ctest -L unittest`` will run all built C++ unit tests, while ``ctest
+-L parquet`` will run only the Parquet unit tests. The unit tests depend on
+the environment variable ``PARQUET_TEST_DATA``, which points into a git
+submodule of the repository https://github.com/apache/parquet-testing:
+
+.. code-block:: shell
+
+   git submodule update --init
+   export PARQUET_TEST_DATA=$ARROW_ROOT/cpp/submodules/parquet-testing/data
+
+Here ``$ARROW_ROOT`` is the absolute path to the Arrow codebase.
+
+Arrow Flight RPC
+================
+
+In addition to the Arrow dependencies, Flight requires:
+
+* gRPC (>= 1.14, roughly)
+* Protobuf (>= 3.6, earlier versions may work)
+* c-ares (used by gRPC)
+
+By default, Arrow will try to download and build these dependencies
+when building Flight.
+
+The optional ``flight`` libraries and tests can be built by passing
+``-DARROW_FLIGHT=ON``.
+
+.. code-block:: shell
+
+   cmake .. -DARROW_FLIGHT=ON -DARROW_BUILD_TESTS=ON
+   make
+
+You can also use existing installations of the extra dependencies.
+When building, set the environment variables ``gRPC_ROOT`` and/or
+``Protobuf_ROOT`` and/or ``c-ares_ROOT``.
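+
+For instance, assuming those dependencies were installed under a hypothetical
+``/opt`` prefix (note that ``c-ares_ROOT`` contains a hyphen, so it cannot be
+set with a plain ``export``; ``env`` handles it):
+
+.. code-block:: shell
+
+   env gRPC_ROOT=/opt/grpc \
+       Protobuf_ROOT=/opt/protobuf \
+       c-ares_ROOT=/opt/c-ares \
+       cmake .. -DARROW_FLIGHT=ON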
+
+We are developing against recent versions of gRPC. The ``grpc-cpp`` package
+available from https://conda-forge.org/ is one reliable way to obtain gRPC in
+a cross-platform way. You may try using system libraries for gRPC and
+Protobuf, but these are likely to be too old. On macOS, you can try `Homebrew
+<https://brew.sh/>`_:
+
+.. code-block:: shell
+
+   brew install grpc
+
+Development Conventions
+=======================
+
+This section provides information about some of the abstractions and
+development approaches we use to solve problems common to many parts of the
+C++ project.
+
+Memory Pools
+~~~~~~~~~~~~
+
+We provide a default memory pool with ``arrow::default_memory_pool()``. As a
+matter of convenience, some of the array builder classes have constructors
+which use the default pool without explicitly passing it. You can disable these
+constructors in your application (so that you are accounting properly for all
+memory allocations) by defining ``ARROW_NO_DEFAULT_MEMORY_POOL``.
+
+Header files
+~~~~~~~~~~~~
+
+We use the ``.h`` extension for C++ header files. Any header file name not
+containing ``internal`` is considered to be a public header, and will be
+automatically installed by the build.
+
+Error Handling and Exceptions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For error handling, we use ``arrow::Status`` values instead of throwing C++
+exceptions. Since the Arrow C++ libraries are intended to be useful as a
+component in larger C++ projects, using ``Status`` objects can help with good
+code hygiene by making explicit when a function is expected to be able to fail.
+
+For expressing invariants and "cannot fail" errors, we use DCHECK macros
+defined in ``arrow/util/logging.h``. These checks are disabled in release
+builds and are intended to catch internal development errors, particularly
+when refactoring. These macros are not to be included in any public header
+files.
+
+Since we do not use exceptions, we avoid doing expensive work in object
+constructors. Objects that are expensive to construct may often have private
+constructors, with public static factory methods that return ``Status``.
+
+There are a number of object constructors, like ``arrow::Schema`` and
+``arrow::RecordBatch``, where larger STL container objects like
+``std::vector`` may be created. While it is possible for ``std::bad_alloc`` to
+be thrown in these constructors, the circumstances in which that would happen
+are somewhat esoteric, and it is likely that an application would have
+encountered other more serious problems prior to having ``std::bad_alloc``
+thrown in a constructor.
diff --git a/docs/source/developers/documentation.rst b/docs/source/developers/documentation.rst
index 1fbab43..452305b 100644
--- a/docs/source/developers/documentation.rst
+++ b/docs/source/developers/documentation.rst
@@ -65,7 +65,7 @@ These two steps are mandatory and must be executed in order.
 
       This step requires the the pyarrow library is installed
       in your python environment.  One way to accomplish
-      this is to follow the build instructions at :ref:`development`
+      this is to follow the build instructions at :ref:`python-development`
       and then run `python setup.py install` in arrow/python
       (it is best to do this in a dedicated conda/virtual environment).
 
diff --git a/docs/source/developers/index.rst b/docs/source/developers/index.rst
index e99f7c5..a58f969 100644
--- a/docs/source/developers/index.rst
+++ b/docs/source/developers/index.rst
@@ -15,11 +15,11 @@
 .. specific language governing permissions and limitations
 .. under the License.
 
-Developing Apache Arrow
-=======================
-
 .. toctree::
    :maxdepth: 2
 
+   contributing
+   cpp
+   python
    integration
    documentation
diff --git a/docs/source/developers/integration.rst b/docs/source/developers/integration.rst
index 83597fa..df56231 100644
--- a/docs/source/developers/integration.rst
+++ b/docs/source/developers/integration.rst
@@ -15,6 +15,8 @@
 .. specific language governing permissions and limitations
 .. under the License.
 
+.. _integration:
+
 Integration Testing
 ===================
 
diff --git a/docs/source/python/development.rst b/docs/source/developers/python.rst
similarity index 58%
rename from docs/source/python/development.rst
rename to docs/source/developers/python.rst
index 7a9e8cb..7a0b7f7 100644
--- a/docs/source/python/development.rst
+++ b/docs/source/developers/python.rst
@@ -16,13 +16,83 @@
 .. under the License.
 
 .. currentmodule:: pyarrow
-.. _development:
+.. _python-development:
 
-***********
-Development
-***********
+******************
+Python Development
+******************
 
-Developing on Linux and MacOS
+This page provides general Python development guidelines and source build
+instructions for all platforms.
+
+Coding Style
+============
+
+We follow a similar PEP8-like coding style to the `pandas project
+<https://github.com/pandas-dev/pandas>`_.
+
+The code must pass ``flake8`` (available from pip or conda) or it will fail the
+build. Check for style errors before submitting your pull request with:
+
+.. code-block:: shell
+
+   flake8 .
+   flake8 --config=.flake8.cython .
+
+Unit Testing
+============
+
+We are using `pytest <https://docs.pytest.org/en/latest/>`_ to develop our unit
+test suite. After building the project (see below) you can run its unit tests
+like so:
+
+.. code-block:: shell
+
+   pytest pyarrow
+
+Package requirements to run the unit tests are found in
+``requirements-test.txt`` and can be installed if needed with ``pip install -r
+requirements-test.txt``.
+
+The project has a number of custom command line options for its test
+suite. Some tests are disabled by default, for example. To see all the options,
+run
+
+.. code-block:: shell
+
+   pytest pyarrow --help
+
+and look for the "custom options" section.
+
+Test Groups
+-----------
+
+We have many tests that are grouped together using pytest marks. Some of these
+are disabled by default. To enable a test group, pass ``--$GROUP_NAME``,
+e.g. ``--parquet``. To disable a test group, prepend ``disable``, so
+``--disable-parquet`` for example. To run **only** the unit tests for a
+particular group, prepend ``only-`` instead, for example ``--only-parquet``.
+
+The test groups currently include:
+
+* ``gandiva``: tests for Gandiva expression compiler (uses LLVM)
+* ``hdfs``: tests that use libhdfs or libhdfs3 to access the Hadoop filesystem
+* ``hypothesis``: tests that use the ``hypothesis`` module for generating
+  random test cases. Note that ``--hypothesis`` doesn't work due to a quirk
+  with pytest, so you have to pass ``--enable-hypothesis``
+* ``large_memory``: Tests requiring a large amount of system RAM
+* ``orc``: Apache ORC tests
+* ``parquet``: Apache Parquet tests
+* ``plasma``: Plasma Object Store tests
+* ``s3``: Tests for Amazon S3
+* ``tensorflow``: Tests that involve TensorFlow
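+
+For example, to run only the Parquet tests, or the default test set with the
+Plasma tests disabled (pure illustrations of the flags described above):
+
+.. code-block:: shell
+
+   pytest pyarrow --only-parquet
+   pytest pyarrow --disable-plasma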
+
+Benchmarking
+------------
+
+For running the benchmarks, see :ref:`python-benchmarks`.
+
+Building on Linux and macOS
 =============================
 
 System Requirements
@@ -31,25 +101,20 @@ System Requirements
 On macOS, any modern XCode (6.4 or higher; the current version is 8.3.1) is
 sufficient.
 
-On Linux, for this guide, we recommend using gcc 4.8 or 4.9, or clang 3.7 or
+On Linux, for this guide, we require a minimum of gcc 4.8, or clang 3.7 or
 higher. You can check your version by running
 
 .. code-block:: shell
 
    $ gcc --version
 
-On Ubuntu 16.04 and higher, you can obtain gcc 4.9 with:
+If the system compiler is older than gcc 4.8, it can be set to a newer version
+using the ``$CC`` and ``$CXX`` environment variables:
 
 .. code-block:: shell
 
-   $ sudo apt-get install g++-4.9
-
-Finally, set gcc 4.9 as the active compiler using:
-
-.. code-block:: shell
-
-   export CC=gcc-4.9
-   export CXX=g++-4.9
+   export CC=gcc-4.8
+   export CXX=g++-4.8
 
 Environment Setup and Build
 ---------------------------
@@ -74,7 +139,7 @@ Using Conda
 ~~~~~~~~~~~
 
 Let's create a conda environment with all the C++ build and Python dependencies
-from conda-forge:
+from conda-forge, targeting development for Python 3.7:
 
 On Linux and OSX:
 
@@ -84,22 +149,23 @@ On Linux and OSX:
         --file arrow/ci/conda_env_unix.yml \
         --file arrow/ci/conda_env_cpp.yml \
         --file arrow/ci/conda_env_python.yml \
-        python=3.6
+        compilers \
+        python=3.7
+
+As of January 2019, the ``compilers`` package is needed on many Linux
+distributions to use packages from conda-forge.
+
+With this out of the way, you can now activate the conda environment:
+
+.. code-block:: shell
+
    conda activate pyarrow-dev
 
-For Windows, see the `Developing on Windows`_ section below.
+For Windows, see the `Building on Windows`_ section below.
 
 We need to set some environment variables to let Arrow's build system know
 about our build toolchain:
 
 .. code-block:: shell
 
-   export ARROW_BUILD_TYPE=release
-   export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
    export ARROW_HOME=$CONDA_PREFIX
-   export PARQUET_HOME=$CONDA_PREFIX
-   export BOOST_HOME=$CONDA_PREFIX
 
 Using pip
 ~~~~~~~~~
@@ -161,10 +227,7 @@ about our build toolchain:
 
 .. code-block:: shell
 
-   export ARROW_BUILD_TYPE=release
-
    export ARROW_HOME=$(pwd)/dist
-   export PARQUET_HOME=$(pwd)/dist
    export LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH
 
 Build and test
@@ -177,26 +240,34 @@ Now build and install the Arrow C++ libraries:
    mkdir arrow/cpp/build
    pushd arrow/cpp/build
 
-   cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
-         -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
+   cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
          -DCMAKE_INSTALL_LIBDIR=lib \
-         -DARROW_PARQUET=on \
-         -DARROW_PYTHON=on \
-         -DARROW_PLASMA=on \
-         -DARROW_BUILD_TESTS=OFF \
+         -DARROW_FLIGHT=ON \
+         -DARROW_GANDIVA=ON \
+         -DARROW_ORC=ON \
+         -DARROW_PARQUET=ON \
+         -DARROW_PYTHON=ON \
+         -DARROW_PLASMA=ON \
+         -DARROW_BUILD_TESTS=ON \
          ..
    make -j4
    make install
    popd
 
-If you don't want to build and install the Plasma in-memory object store,
-you can omit the ``-DARROW_PLASMA=on`` flag.
-Also, if multiple versions of Python are installed in your environment,
-you may have to pass additional parameters to cmake so that
-it can find the right executable, headers and libraries.
-For example, specifying `-DPYTHON_EXECUTABLE=$VIRTUAL_ENV/bin/python`
-(assuming that you're in virtualenv) enables cmake to choose
-the python executable which you are using.
+Many of these components are optional, and can be switched off by setting them
+to ``OFF``:
+
+* ``ARROW_FLIGHT``: RPC framework
+* ``ARROW_GANDIVA``: LLVM-based expression compiler
+* ``ARROW_ORC``: Support for Apache ORC file format
+* ``ARROW_PARQUET``: Support for Apache Parquet file format
+* ``ARROW_PLASMA``: Shared memory object store
+
+If multiple versions of Python are installed in your environment, you may have
+to pass additional parameters to cmake so that it can find the right
+executable, headers and libraries.  For example, specifying
+``-DPYTHON_EXECUTABLE=$VIRTUAL_ENV/bin/python`` (assuming that you're in a
+virtualenv) enables cmake to choose the Python executable you are using.
 
 .. note::
 
@@ -210,11 +281,15 @@ Now, build pyarrow:
 .. code-block:: shell
 
    pushd arrow/python
-   python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
-          --with-parquet --with-plasma --inplace
+   export PYARROW_WITH_FLIGHT=1
+   export PYARROW_WITH_GANDIVA=1
+   export PYARROW_WITH_ORC=1
+   export PYARROW_WITH_PARQUET=1
+   python setup.py build_ext --build-type=$ARROW_BUILD_TYPE --inplace
    popd
 
-If you did not build with plasma, you can omit ``--with-plasma``.
+If you did not build one of the optional components, set the corresponding
+``PYARROW_WITH_$COMPONENT`` environment variable to 0.
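+
+For example, if the C++ libraries were built without Gandiva, you would
+disable that component before building pyarrow:
+
+.. code-block:: shell
+
+   export PYARROW_WITH_GANDIVA=0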
 
 You should be able to run the unit tests with:
 
@@ -242,56 +317,25 @@ libraries), one can set ``--bundle-arrow-cpp``:
 
    pip install wheel  # if not installed
    python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
-          --with-parquet --with-plasma --bundle-arrow-cpp bdist_wheel
-
-Again, if you did not build with plasma, you should omit ``--with-plasma``.
-
-Building with optional ORC integration
---------------------------------------
+          --bundle-arrow-cpp bdist_wheel
 
-To build Arrow with support for the `Apache ORC file format <https://orc.apache.org/>`_,
-we recommend the following:
+Building with CUDA support
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-#. Install the ORC C++ libraries and tools using ``conda``:
-
-   .. code-block:: shell
-
-      conda install -c conda-forge orc
-
-#. Set ``ORC_HOME`` and ``PROTOBUF_HOME`` to the location of the installed
-   Orc and protobuf C++ libraries, respectively (otherwise Arrow will try
-   to download source versions of those libraries and recompile them):
-
-   .. code-block:: shell
-
-      export ORC_HOME=$CONDA_PREFIX
-      export PROTOBUF_HOME=$CONDA_PREFIX
-
-#. Add ``-DARROW_ORC=on`` to the CMake flags.
-#. Add ``--with-orc`` to the ``setup.py`` flags.
-
-Known issues
-------------
-
-If using packages provided by conda-forge (see "Using Conda" above)
-together with a reasonably recent compiler, you may get "undefined symbol"
-errors when importing pyarrow.  In that case you'll need to force the C++
-ABI version to the older version used by conda-forge binaries:
+The :mod:`pyarrow.cuda` module offers support for using Arrow platform
+components with Nvidia's CUDA-enabled GPU devices. To build with this support,
+pass ``-DARROW_CUDA=ON`` when building the C++ libraries, and set the following
+environment variable when building pyarrow:
 
 .. code-block:: shell
 
-   export CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
-   export PYARROW_CXXFLAGS=$CXXFLAGS
+   export PYARROW_WITH_CUDA=1
 
-Be sure to add ``-DCMAKE_CXX_FLAGS=$CXXFLAGS`` to the cmake invocations
-when rebuilding.
+Building on Windows
+===================
 
-Developing on Windows
-=====================
-
-First, we bootstrap a conda environment similar to the `C++ build instructions
-<https://github.com/apache/arrow/blob/master/cpp/apidoc/Windows.md>`_. This
-includes all the dependencies for Arrow and the Apache Parquet C++ libraries.
+We bootstrap a conda environment similar to the above, but skipping some of
+the Linux/macOS-only packages:
 
 First, starting from fresh clones of Apache Arrow:
 
@@ -313,20 +357,18 @@ Now, we build and install Arrow C++ libraries
 
    mkdir cpp\build
    cd cpp\build
-   set ARROW_BUILD_TOOLCHAIN=%CONDA_PREFIX%\Library
    set ARROW_HOME=C:\thirdparty
    cmake -G "Visual Studio 14 2015 Win64" ^
          -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^
          -DCMAKE_BUILD_TYPE=Release ^
          -DARROW_BUILD_TESTS=on ^
          -DARROW_CXXFLAGS="/WX /MP" ^
+         -DARROW_GANDIVA=on ^
          -DARROW_PARQUET=on ^
          -DARROW_PYTHON=on ..
    cmake --build . --target INSTALL --config Release
    cd ..\..
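The flag list in the invocation above grows as more components are enabled. One way to keep it manageable is to collect the optional flags in a variable; the sketch below uses POSIX shell for illustration even though the guide uses Windows cmd, and the variable name is an assumption:

```shell
# Illustrative only: gather the optional component flags so individual
# pieces (Gandiva, Parquet, Python) are easy to toggle. The cmake command
# is echoed, not executed.
COMPONENT_FLAGS="-DARROW_GANDIVA=on -DARROW_PARQUET=on -DARROW_PYTHON=on"
echo "cmake -G \"Visual Studio 14 2015 Win64\" $COMPONENT_FLAGS .."
```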
 
-If building with LLVM, also add `-DARROW_GANDIVA=ON`.
-
 After that, we must put the install directory's bin path in our ``%PATH%``:
 
 .. code-block:: shell
@@ -358,7 +400,10 @@ Getting ``python-test.exe`` to run is a bit tricky because your
 
 Now ``python-test.exe`` or simply ``ctest`` (to run all tests) should work.
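When iterating on a single failing test, ``ctest``'s name filter saves rerun time. A sketch (the ``-R`` pattern is an assumed test name; the command is echoed rather than run so no build tree is required):

```shell
# Illustrative: run only tests matching a regex, printing full output for
# failing tests. Echoed so it can be inspected without a build present.
CTEST_CMD="ctest -R python-test --output-on-failure"
echo "$CTEST_CMD"
```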
 
-Building the Documentation
-==========================
+Windows Caveats
+---------------
+
+Some components are not supported yet on Windows:
 
-See :ref:`building-docs` for instructions to build the HTML documentation.
+* Flight RPC
+* Plasma
diff --git a/docs/source/index.rst b/docs/source/index.rst
index ea5e3bc..3a5728a 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -18,15 +18,25 @@
 Apache Arrow
 ============
 
-Apache Arrow is a cross-language development platform for in-memory data. It
-specifies a standardized language-independent columnar memory format for flat
-and hierarchical data, organized for efficient analytic operations on modern
-hardware. It also provides computational libraries and zero-copy streaming
-messaging and interprocess communication.
+Apache Arrow is a development platform for in-memory analytics. It contains a
+set of technologies that enable big data systems to process and move data
+fast. It specifies a standardized language-independent columnar memory format
+for flat and hierarchical data, organized for efficient analytic operations on
+modern hardware.
+
+The project is developing a multi-language collection of libraries for solving
+systems problems related to in-memory analytical data processing. This includes
+such topics as:
+
+* Zero-copy shared memory and RPC-based data movement
+* Reading and writing file formats (like CSV, Apache ORC, and Apache Parquet)
+* In-memory analytics and query processing
+
+.. _toc.columnar:
 
 .. toctree::
    :maxdepth: 1
-   :caption: Memory Format
+   :caption: Arrow Columnar Format
 
    format/README
    format/Guidelines
@@ -34,15 +44,19 @@ messaging and interprocess communication.
    format/Metadata
    format/IPC
 
+.. _toc.usage:
+
 .. toctree::
    :maxdepth: 2
-   :caption: Languages
+   :caption: Arrow Libraries
 
    cpp/index
    python/index
 
+.. _toc.development:
+
 .. toctree::
    :maxdepth: 2
-   :caption: Developers
+   :caption: Development and Contributing
 
    developers/index
diff --git a/docs/source/python/benchmarks.rst b/docs/source/python/benchmarks.rst
index 12205c5..5ecff1a 100644
--- a/docs/source/python/benchmarks.rst
+++ b/docs/source/python/benchmarks.rst
@@ -15,6 +15,8 @@
 .. specific language governing permissions and limitations
 .. under the License.
 
+.. _python-benchmarks:
+
 Benchmarks
 ==========
 
diff --git a/docs/source/python/index.rst b/docs/source/python/index.rst
index 93c2cda..7f227c5 100644
--- a/docs/source/python/index.rst
+++ b/docs/source/python/index.rst
@@ -47,6 +47,5 @@ files into Arrow structures.
    cuda
    extending
    api
-   development
    getting_involved
    benchmarks
diff --git a/docs/source/python/install.rst b/docs/source/python/install.rst
index deb0786..8a47b4a 100644
--- a/docs/source/python/install.rst
+++ b/docs/source/python/install.rst
@@ -63,4 +63,4 @@ need to install the `Visual C++ Redistributable for Visual Studio 2015
 Installing from source
 ----------------------
 
-See :ref:`development`.
+See :ref:`python-development`.
diff --git a/docs/source/python/parquet.rst b/docs/source/python/parquet.rst
index 5422ebe..bd3b349 100644
--- a/docs/source/python/parquet.rst
+++ b/docs/source/python/parquet.rst
@@ -37,7 +37,7 @@ which includes a native, multithreaded C++ adapter to and from in-memory Arrow
 data. PyArrow includes Python bindings to this code, which thus enables reading
 and writing Parquet files with pandas as well.
 
-Obtaining PyArrow with Parquet Support
+Obtaining pyarrow with Parquet Support
 --------------------------------------
 
 If you installed ``pyarrow`` with pip or conda, it should be built with Parquet
@@ -49,8 +49,8 @@ support bundled:
 
 If you are building ``pyarrow`` from source, you must use
 ``-DARROW_PARQUET=ON`` when compiling the C++ libraries and enable the Parquet
-extensions when building ``pyarrow``. See the :ref:`Development <development>`
-page for more details.
+extensions when building ``pyarrow``. See the :ref:`Python Development
+<python-development>` page for more details.
 
 Reading and Writing Single Files
 --------------------------------
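After a from-source build, a quick sanity check is attempting to import the Parquet extension. The check below is a hypothetical helper, only echoed here, not a command taken from this page:

```shell
# Hypothetical post-build check: the echoed import fails fast when run if
# pyarrow was built without -DARROW_PARQUET=ON / the Parquet extension.
CHECK_CMD='python -c "import pyarrow.parquet"'
echo "$CHECK_CMD"
```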
diff --git a/python/README.md b/python/README.md
index ce7bdde..658deb3 100644
--- a/python/README.md
+++ b/python/README.md
@@ -1,4 +1,4 @@
-<!---
+<!---
   Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
@@ -40,52 +40,13 @@ pip install pyarrow
 
 ## Development
 
-### Coding Style
-
-We follow a similar PEP8-like coding style to the [pandas project][3].
-
-The code must pass `flake8` (available from pip or conda) or it will fail the
-build. Check for style errors before submitting your pull request with:
-
-```
-flake8 .
-flake8 --config=.flake8.cython .
-```
-
-### Building from Source
-
-See the [Development][2] page in the documentation.
-
-### Running the unit tests
-
-We are using [pytest][4] to develop our unit test suite. After building the
-project using `setup.py build_ext --inplace`, you can run its unit tests like
-so:
-
-```bash
-pytest pyarrow
-```
-
-The project has a number of custom command line options for its test
-suite. Some tests are disabled by default, for example. To see all the options,
-run
-
-```bash
-pytest pyarrow --help
-```
-
-and look for the "custom options" section.
-
-For running the benchmarks, see the [Sphinx documentation][5].
+See [Python Development][2] in the documentation subproject.
 
 ### Building the documentation
 
-```bash
-pip install -r ../docs/requirements.txt
-python setup.py build_sphinx -s ../docs/source
-```
+See [documentation build instructions][1] in the documentation subproject.
 
-[2]: https://github.com/apache/arrow/blob/master/docs/source/python/development.rst
+[1]: https://github.com/apache/arrow/blob/master/docs/source/developers/documentation.rst
+[2]: https://github.com/apache/arrow/blob/master/docs/source/developers/python.rst
 [3]: https://github.com/pandas-dev/pandas
-[4]: https://docs.pytest.org/en/latest/
 [5]: https://arrow.apache.org/docs/latest/python/benchmarks.html
