Installing the Hadoop client Java dependencies in Docker isn't so bad if you start from a known working solution, e.g. something like:
* https://github.com/dask/hdfs3/blob/master/continuous_integration/Dockerfile

It would be nice to be able to support libhdfs++ in Arrow, but someone would need to step up and champion that development work.

On Tue, Sep 25, 2018 at 7:08 AM Alberto Ramón <[email protected]> wrote:
>
> I have Apache Arrow on Docker (outside Hadoop cluster)
> Write to HDFS using pyArrow can be difficult solve in Docker all
> dependencies (Hadoop_home, Java_home, ... )
>
> I don't know if use Native RPC access in Python is my best option or there is
> some alternative (HDFS-8707)
>
> Some suggestion ??
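
On the Hadoop_home / Java_home question above: a quick sanity check inside the container can save some debugging before pyarrow ever tries to load libhdfs. The following is only a rough sketch; the helper name `missing_hdfs_env` is made up for illustration, and the variable list assumes the usual libhdfs setup (JAVA_HOME, HADOOP_HOME, and a CLASSPATH typically filled from `hadoop classpath --glob`).

```python
import os

# Variables the libhdfs-based HDFS driver normally expects to find
# (assumed list; adjust it to match your image).
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "CLASSPATH")


def missing_hdfs_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_hdfs_env()
    if missing:
        print("Set these before connecting to HDFS:", ", ".join(missing))
    else:
        print("Hadoop client environment looks complete")
```

Running this as the container's first step makes a missing ENV line in the Dockerfile show up as a plain message rather than a JNI load error later on.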
