Nathanael Leaute created ARROW-13699:
----------------------------------------

             Summary: [Python][Doc] Refactor the FileSystem Interface 
documentation
                 Key: ARROW-13699
                 URL: https://issues.apache.org/jira/browse/ARROW-13699
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Documentation
            Reporter: Nathanael Leaute


As a Python developer working with different cloud vendors and storage systems, I'd like 
to quickly jump to code examples showing how to read and write files on each 
filesystem.

The documentation in question is the Python filesystems page: 
[https://arrow.apache.org/docs/python/filesystems.html]

I find the information a bit scattered; it could be improved by adopting the 
following organisation.
h1. Filesystem Interface

_overview of the PyArrow FS Interface_
h2. Usage
h3. Local Filesystem

_description_
h4. Writing files

_code example_
h4. Listing files

_code example_
h4. Reading files

_code example_
h3. S3 Filesystem

_description / configuration_
h4. Writing files

_code example_
h4. Listing files

_code example_
h4. Reading files

_code example_
h3. Hadoop Filesystem

_description / configuration_
h4. Writing files

_code example_
h4. Listing files

_code example_
h4. Reading files

_code example_
h3. Extending to fsspec-compatible filesystems

_description_
h4. Google Cloud Storage

_code example_
h4. Azure

_code example_

That way, a developer working on S3 can jump directly to the section of 
interest and start experimenting with the code examples.
Additionally, if new Python bindings are created for an "Arrow native" 
filesystem, the documentation can be extended with a new section in the same vein as 
the others.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
