I have Apache Arrow running in Docker (outside the Hadoop cluster).
Writing to HDFS using PyArrow
<https://arrow.apache.org/docs/python/filesystems.html#hadoop-file-system-hdfs>
can be difficult, because all the dependencies (HADOOP_HOME, JAVA_HOME,
... ) have to be resolved inside the Docker container.
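
To make the dependency problem concrete, here is a minimal sketch of the check I would run inside the container before trying to connect. It assumes the variables named in the PyArrow HDFS documentation (JAVA_HOME, HADOOP_HOME, CLASSPATH; ARROW_LIBHDFS_DIR is optional there and is not checked). The helper name missing_hdfs_env is hypothetical, not part of PyArrow.

```python
import os

# Environment variables the PyArrow HDFS docs ask for (assumption: this
# list follows the linked documentation; ARROW_LIBHDFS_DIR is optional
# there and is deliberately left out).
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "CLASSPATH")


def missing_hdfs_env(env=None):
    """Return the required variables that are unset or empty.

    `env` is any mapping; it defaults to the process environment.
    This is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Report what is still missing in the current container.
print("missing:", missing_hdfs_env())
```

An empty list only means the variables are set, not that libhdfs will actually load or that the cluster is reachable from the container.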

I don't know whether native RPC access from Python
<http://wesmckinney.com/blog/python-hdfs-interfaces/> is my best option, or
whether there is some alternative (HDFS-8707
<https://issues.apache.org/jira/browse/HDFS-8707>).

Any suggestions?
