numbworks commented on issue #39846:
URL: https://github.com/apache/arrow/issues/39846#issuecomment-2294864651

   > you have a different problem, you don't have cmake installed. Building 
Pyarrow from source needs a matching version of Arrow C++ available as well.
   
   @assignUser 
   Hey Jacob, thank you for your answer, but it seems from the thread that your 
solution doesn't work. 
   
   Do you have a Dockerfile example that demonstrates that your proposal works?
   
   I also read the following answer from you in another thread:
   
   > Please see https://github.com/apache/arrow/issues/18036, we don't publish 
musl wheels at the moment.
   
   Are there plans to change this? In case you are not aware of it, 
**PyArrow will be a mandatory dependency for Pandas starting with Pandas 
v3.0.0**. Please read more here: 
[https://pandas.pydata.org/pdeps/0010-required-pyarrow-dependency.html](https://pandas.pydata.org/pdeps/0010-required-pyarrow-dependency.html).
   
   One of the official Python images on Docker Hub is based on Alpine Linux, 
which is also the most resource-efficient option. The lack of PyArrow wheels 
for Alpine means that, starting with Pandas 3.0.0 (maybe six months from 
now), thousands of Python developers and data scientists won't be able to do 
their work in Alpine-based containers.
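
   To illustrate why a glibc wheel does not help here: Alpine uses musl libc, 
so pip there only accepts wheels whose platform tag is `musllinux` (PEP 656), 
not the `manylinux` wheels built against glibc. Below is a minimal sketch that 
classifies a wheel by its filename. The helper name `wheel_platform_family` 
and the example filenames are my own illustrations, not part of pip or 
PyArrow.

```python
def wheel_platform_family(filename: str) -> str:
    """Classify a wheel by the platform tag at the end of its filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # Wheel names end with "-{python tag}-{abi tag}-{platform tag}.whl".
    platform_tag = filename[: -len(".whl")].split("-")[-1]
    # A compressed tag set joins several platform tags with dots.
    tags = platform_tag.split(".")
    if any(tag.startswith("musllinux") for tag in tags):
        return "musllinux"  # built for musl libc, e.g. Alpine
    if any(tag.startswith("manylinux") for tag in tags):
        return "manylinux"  # built for glibc, e.g. Debian
    return "other"


# Illustrative filenames only, following the wheel naming convention.
examples = [
    "pyarrow-15.0.0-cp312-cp312-manylinux_2_28_x86_64.whl",
    "example_pkg-1.0.0-cp312-cp312-musllinux_1_1_x86_64.whl",
    "example_pkg-1.0.0-py3-none-any.whl",
]
for name in examples:
    print(f"{name}: {wheel_platform_family(name)}")
```

   When no compatible wheel exists, pip falls back to the source 
distribution, which is where the cmake and Arrow C++ requirements quoted 
above come in.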
   
   The only alternative at the moment is to use the Debian-based image on 
Python's Docker Hub, which is 15x more resource-hungry than Alpine:
   
   ```
   FROM python:3.12.5-bullseye
   
   # Install pinned versions of NumPy, PyArrow, openpyxl, and pandas.
   RUN pip install --upgrade pip \
       && pip install numpy==1.26.3 \
       && pip install pyarrow==15.0.0 \
       && pip install openpyxl==3.1.0 \
       && pip install pandas==2.2.0
   ```
   
   I hope you can discuss this matter within the team and assign the right 
priority to it.
   
   Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

