[
https://issues.apache.org/jira/browse/ARROW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422801#comment-17422801
]
Carlos O'Ryan commented on ARROW-1231:
--------------------------------------
Thanks for the welcome. Here are a few questions:
* I would prefer to break down this work into smaller PRs. For example, one PR
could create the {{GCSFileSystem}} class but leave most of the functions
returning {{Status::Unimplemented()}}. I think that makes it easier to review
the code. Is that acceptable?
* Is there a good place for me to read about how to add this new (optional)
component to your CI systems? I assume we would want to test this on a variety
of platforms, but so far I have been stumbling through your scripts and I am
not sure which ones I should change to test with {{ARROW_GCS}} enabled, and
which ones I should leave alone.
* It seems that we would want to run an emulator or testbench to verify this
filesystem works. I know of two
([fsouza/fake-gcs-server|https://github.com/fsouza/fake-gcs-server] is
mentioned above; we also use
[googleapis/storage-testbench|https://github.com/googleapis/storage-testbench]).
Any preference on which one I should use?
* GCS does not have directories or folders. Maybe confusingly, some of Google's
own tools [pretend|https://cloud.google.com/storage/docs/folders] that it does.
Should I try to emulate directories in the filesystem? Some things are easy
to emulate (listing all the objects with a common prefix for example), some
things are impossible (the equivalent of {{stat(2)}} cannot be done as
directories are simply "common prefixes"). My preference would be to *not*
emulate them.
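
To make the last question concrete, here is a minimal sketch of what "listing all the objects with a common prefix" could look like over a flat namespace. It does not use Arrow or the GCS client library: {{PrefixListing}} and {{ListWithPrefix}} are hypothetical names made up for this illustration, and the object names are passed in as a plain vector instead of coming from a real bucket.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not an Arrow or GCS type.
struct PrefixListing {
  std::vector<std::string> objects;  // objects directly under the prefix
  std::set<std::string> prefixes;    // emulated "subdirectories"
};

// Emulates a one-level directory listing over a flat list of object names.
// An object whose name has a delimiter after the prefix is folded into a
// "common prefix" entry; everything else under the prefix is listed as-is.
PrefixListing ListWithPrefix(const std::vector<std::string>& object_names,
                             const std::string& prefix,
                             char delimiter = '/') {
  PrefixListing result;
  for (const auto& name : object_names) {
    // Skip objects that do not start with the requested prefix.
    if (name.compare(0, prefix.size(), prefix) != 0) continue;
    const auto pos = name.find(delimiter, prefix.size());
    if (pos == std::string::npos) {
      result.objects.push_back(name);
    } else {
      // Keep the trailing delimiter so the entry reads like a "directory".
      result.prefixes.insert(name.substr(0, pos + 1));
    }
  }
  return result;
}
```

With the names {{data/a.parquet}}, {{data/2021/b.parquet}}, {{data/2021/c.parquet}} and {{logs/x.txt}}, listing the prefix {{data/}} yields one object ({{data/a.parquet}}) and one emulated directory ({{data/2021/}}). Note that {{data/2021/}} only appears because objects happen to share that prefix; there is nothing to query for it on its own, which is the {{stat(2)}} problem described above.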
> [C++] Add filesystem / IO implementation for Google Cloud Storage
> -----------------------------------------------------------------
>
> Key: ARROW-1231
> URL: https://issues.apache.org/jira/browse/ARROW-1231
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Wes McKinney
> Assignee: Carlos O'Ryan
> Priority: Major
> Labels: filesystem
>
> See example jumping off point
> https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/platform/cloud