[ https://issues.apache.org/jira/browse/ARROW-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612518#comment-17612518 ]

Joris Van den Bossche edited comment on ARROW-17916 at 10/4/22 8:28 AM:
------------------------------------------------------------------------

Ah, yes, for Dataset and HDFS I was thinking about the Cython level, not about
our C++. But checking pyarrow/src, I don't think we actually require dataset
or hdfs in pyarrow C++.
I think the scope of {{ARROW_PYTHON}} was also partly a convenience to build the
"common" things, not strictly the required parts.



> [Python] Allow disabling more components
> ----------------------------------------
>
>                 Key: ARROW-17916
>                 URL: https://issues.apache.org/jira/browse/ARROW-17916
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: Python
>    Affects Versions: 9.0.0
>            Reporter: Antoine Pitrou
>            Priority: Major
>             Fix For: 11.0.0
>
>
> Some users would like to build lightweight versions of PyArrow, for example 
> for use in AWS Lambda or similar systems which constrain the total size of 
> usable libraries.
> However, PyArrow currently mandates some Arrow C++ components which can lead 
> to a very sizable Arrow binary install: Compute, CSV, Dataset, Filesystem, 
> HDFS and JSON.
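As a rough sketch of what such a lightweight build could look like once these components become optional (the `ARROW_*` CMake switches below follow the Arrow C++ project's existing option naming; treating them all as independently disableable here is the assumption this ticket asks for):

```shell
# Hypothetical minimal Arrow C++ configuration for a slim PyArrow build,
# assuming the currently mandated components can each be switched off.
cmake -S arrow/cpp -B arrow/cpp/build \
    -DARROW_COMPUTE=OFF \
    -DARROW_CSV=OFF \
    -DARROW_DATASET=OFF \
    -DARROW_FILESYSTEM=OFF \
    -DARROW_HDFS=OFF \
    -DARROW_JSON=OFF
cmake --build arrow/cpp/build --target install
```

The resulting shared libraries would omit the compute kernels and I/O layers, which is where most of the binary size lives; the PyArrow wheel built against them would shrink accordingly.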



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
