[ 
https://issues.apache.org/jira/browse/BEAM-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260132#comment-16260132
 ] 

Udi Meiri commented on BEAM-3099:
---------------------------------

Doc exploring Python HDFS library options: 
https://docs.google.com/document/d/1-uzKf4VPlGrkBMXM00sxxf3K01Ss3ZzXeju0w5L0LY0/edit?usp=sharing

> Implement HDFS FileSystem for Python SDK
> ----------------------------------------
>
>                 Key: BEAM-3099
>                 URL: https://issues.apache.org/jira/browse/BEAM-3099
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Chamikara Jayalath
>            Assignee: Udi Meiri
>
> Currently the Java SDK has HDFS support but the Python SDK does not. With 
> current portability efforts, other runners may soon be able to use the Python 
> SDK. Having HDFS support will allow these runners to execute large-scale jobs 
> without using GCS. 
> The following post suggests some libraries that can be used to connect to 
> HDFS from Python:
> http://wesmckinney.com/blog/python-hdfs-interfaces/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
