[ https://issues.apache.org/jira/browse/BEAM-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260132#comment-16260132 ]
Udi Meiri commented on BEAM-3099:
---------------------------------

Doc exploring Python HDFS library options:
https://docs.google.com/document/d/1-uzKf4VPlGrkBMXM00sxxf3K01Ss3ZzXeju0w5L0LY0/edit?usp=sharing

> Implement HDFS FileSystem for Python SDK
> ----------------------------------------
>
>                 Key: BEAM-3099
>                 URL: https://issues.apache.org/jira/browse/BEAM-3099
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Chamikara Jayalath
>            Assignee: Udi Meiri
>
> Currently Java SDK has HDFS support but Python SDK does not. With current
> portability efforts other runners may soon be able to use Python SDK. Having
> HDFS support will allow these runners to execute large scale jobs without
> using GCS.
> Following suggests some libraries that can be used to connect to HDFS from
> Python.
> http://wesmckinney.com/blog/python-hdfs-interfaces/

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)