jorisvandenbossche commented on code in PR #41135: URL: https://github.com/apache/arrow/pull/41135#discussion_r1599827310
##########
docs/source/python/install.rst:
##########
@@ -107,17 +107,41 @@ a custom path to the database from Python:
 Differences between conda-forge packages
 ----------------------------------------

-PyArrow is packaged on `conda-forge <https://conda-forge.org/>`_ as three
+On `conda-forge <https://conda-forge.org/>`_, PyArrow is published as three
 separate packages, each providing varying levels of functionality. This is in
 contrast to PyPi, where only a single PyArrow package is provided.

 The purpose of this split is to minimize the size of the installed package for
 most users (``pyarrow``), provide a smaller, minimal package for specialized use
 cases (``pyarrow-core``), while still providing a complete package for users who
-require it (``pyarrow-all``).
+require it (``pyarrow-all``). What was historically ``pyarrow`` on
+`conda-forge <https://conda-forge.org/>`_ is now ``pyarrow-all``, though most
+users can continue using ``pyarrow``.

-The table below lists the functionality provided by each package and may be
-useful when deciding to use one package over another:
+The ``pyarrow-core`` package includes the following functionality:
+
+- :ref:`data`
+- :ref:`compute` (i.e., ``pyarrow.compute``)
+- :ref:`io`
+- :ref:`ipc` (i.e., ``pyarrow.ipc``)
+- :ref:`filesystem` (HDFS, S3, GCS, etc.)

Review Comment:
   `pyarrow.fs` itself is always available, though (with at a minimum just the LocalFileSystem)
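To illustrate the point about `pyarrow.fs`, here is a minimal sketch of how a reader could check which filesystem classes their installed package exposes. The helper name `available_filesystems` and the default list of class names are assumptions made for illustration, not part of PyArrow or of this PR. The sketch catches both `ImportError` and `AttributeError`, since a filesystem that was not built into the installation may surface as either when accessed.

```python
import importlib


def available_filesystems(
    names=("LocalFileSystem", "S3FileSystem", "GcsFileSystem", "HadoopFileSystem"),
):
    """Report which filesystem classes can be accessed on pyarrow.fs.

    Hypothetical helper for illustration only; it is not a PyArrow API.
    Returns a dict mapping each class name to True or False.
    """
    try:
        fs = importlib.import_module("pyarrow.fs")
    except ImportError:
        # pyarrow is not installed at all, so nothing is available.
        return {name: False for name in names}

    result = {}
    for name in names:
        try:
            getattr(fs, name)
            result[name] = True
        except (ImportError, AttributeError):
            # The class is unknown or was not built into this installation.
            result[name] = False
    return result


if __name__ == "__main__":
    for name, ok in available_filesystems().items():
        print(f"{name}: {'available' if ok else 'not available'}")
```

If the comment above holds, `LocalFileSystem` should be reported as available with any of the three conda-forge packages, while the remote filesystems depend on which package is installed.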
