[
https://issues.apache.org/jira/browse/ARROW-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393269#comment-17393269
]
Martin Durant commented on ARROW-1319:
--------------------------------------
> how it can fit into the current API
I mean, it's not really up to me. It's something that HDFS allows you to do that
some people may find useful for file-system operations on HDFS. Yes, hdfs3 had
this functionality.
> the filesystem API is mostly meant for Arrow purposes of reading and writing
> datasets
!! I thought this was meant to be a general-purpose, cross-platform file-system
interface for the supported backends. pyarrow is the *only* way for Python
users to interact with HDFS. If they can't create delegation tokens with this
interface, they won't be able to do so anywhere else. Other functionality falls
into this bucket too, such as setting the replication factor of a file.
> [Python] Add additional HDFS filesystem methods
> -----------------------------------------------
>
> Key: ARROW-1319
> URL: https://issues.apache.org/jira/browse/ARROW-1319
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Martin Durant
> Priority: Major
> Labels: HDFS, filesystem
>
> The Python library hdfs3 (http://hdfs3.readthedocs.io/en/latest/api.html)
> contains a wider set of file-system methods than Arrow's Python bindings.
> These are probably simple to implement for arrow-hdfs.