[
https://issues.apache.org/jira/browse/ARROW-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612518#comment-17612518
]
Joris Van den Bossche edited comment on ARROW-17916 at 10/4/22 8:28 AM:
------------------------------------------------------------------------
Ah, yes, for Dataset and HDFS I was thinking about the Cython level, not about
our C++. But checking pyarrow/src, I don't think we actually require dataset
or hdfs in pyarrow C++.
I think the scope of {{ARROW_PYTHON}} was also partly convenience to build the
"common" things, and not strictly the required parts.
> [Python] Allow disabling more components
> ----------------------------------------
>
> Key: ARROW-17916
> URL: https://issues.apache.org/jira/browse/ARROW-17916
> Project: Apache Arrow
> Issue Type: Wish
> Components: Python
> Affects Versions: 9.0.0
> Reporter: Antoine Pitrou
> Priority: Major
> Fix For: 11.0.0
>
>
> Some users would like to build lightweight versions of PyArrow, for example
> for use in AWS Lambda or similar systems which constrain the total size of
> usable libraries.
> However, PyArrow currently mandates some Arrow C++ components which can lead
> to a very sizable Arrow binary install: Compute, CSV, Dataset, Filesystem,
> HDFS and JSON.
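> As a rough illustration of what this wish amounts to, a slimmed-down build
> might configure Arrow C++ with the optional components switched off. The
> flag names below follow Arrow's existing CMake options; whether PyArrow can
> actually be built and imported against such a reduced library is exactly
> what this issue asks for, so treat this as a sketch, not a working recipe:
>
> ```shell
> # Hypothetical minimal Arrow C++ configuration for a lightweight PyArrow.
> # Disabling Compute/CSV/Dataset/Filesystem/HDFS/JSON is the goal of this
> # issue and may not yield an importable PyArrow on current releases.
> cmake -S arrow/cpp -B arrow/cpp/build \
>   -DCMAKE_BUILD_TYPE=Release \
>   -DARROW_PYTHON=ON \
>   -DARROW_COMPUTE=OFF \
>   -DARROW_CSV=OFF \
>   -DARROW_DATASET=OFF \
>   -DARROW_FILESYSTEM=OFF \
>   -DARROW_HDFS=OFF \
>   -DARROW_JSON=OFF
> cmake --build arrow/cpp/build
> ```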
--
This message was sent by Atlassian Jira
(v8.20.10#820010)