[ 
https://issues.apache.org/jira/browse/ARROW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422801#comment-17422801
 ] 

Carlos O'Ryan commented on ARROW-1231:
--------------------------------------

Thanks for the welcome. Here are a few questions:

* I would prefer to break down this work into smaller PRs. For example, one PR 
could create the {{GCSFileSystem}} class but leave most of the functions 
returning {{Status::Unimplemented()}}.  I think that makes it easier to review 
the code. Is that acceptable?
* Is there a good place for me to read about how to add this new (optional) 
component to your CI systems?  I assume we would want to test this on a variety 
of platforms, but so far I have been stumbling through your scripts and I am 
not sure which ones I should change to test with {{ARROW_GCS}} enabled, and 
which ones I should leave alone.
* It seems that we would want to run an emulator or testbench to verify this 
filesystem works.  I know of two 
([fsouza/fake-gcs-server|https://github.com/fsouza/fake-gcs-server] is 
mentioned above, and we also use 
[googleapis/storage-testbench|https://github.com/googleapis/storage-testbench]).
  Any preference on which one I should use? 
* GCS does not have directories or folders. Maybe confusingly, some of Google's 
own tools [pretend|https://cloud.google.com/storage/docs/folders] that it does. 
 Should I try to emulate directories in the filesystem?  Some things are easy 
to emulate (listing all the objects with a common prefix, for example), but 
others are impossible (the equivalent of {{stat(2)}} cannot be done, as 
directories are simply "common prefixes").  My preference would be to *not* 
emulate them.
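
To make the last point concrete, here is a minimal sketch of the two cases, assuming an in-memory {{std::map}} as a stand-in for a bucket. Nothing here calls GCS or Arrow; {{FakeBucket}}, {{Listing}}, {{ListWithDelimiter}} and {{PrefixHasObjects}} are hypothetical names used only for illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a bucket: object name -> contents.
// This is an assumption for illustration, not an Arrow or GCS client type.
using FakeBucket = std::map<std::string, std::string>;

struct Listing {
  std::vector<std::string> objects;       // names with no delimiter past the prefix
  std::set<std::string> common_prefixes;  // emulated "subdirectories"
};

// List the names that start with `prefix`. Anything that contains `delimiter`
// after the prefix is rolled up into a common prefix instead of being
// reported as an object.
Listing ListWithDelimiter(const FakeBucket& bucket, const std::string& prefix,
                          char delimiter = '/') {
  Listing result;
  // std::map keeps keys sorted, so all names sharing the prefix are
  // contiguous starting at lower_bound(prefix).
  for (auto it = bucket.lower_bound(prefix); it != bucket.end(); ++it) {
    const std::string& name = it->first;
    if (name.compare(0, prefix.size(), prefix) != 0) break;
    const auto pos = name.find(delimiter, prefix.size());
    if (pos == std::string::npos) {
      result.objects.push_back(name);
    } else {
      result.common_prefixes.insert(name.substr(0, pos + 1));
    }
  }
  return result;
}

// The closest thing to "does this directory exist?": true only while at
// least one object name starts with `prefix`. No metadata is attached to the
// prefix itself.
bool PrefixHasObjects(const FakeBucket& bucket, const std::string& prefix) {
  const auto it = bucket.lower_bound(prefix);
  return it != bucket.end() &&
         it->first.compare(0, prefix.size(), prefix) == 0;
}
```

In this sketch, listing {{"data/"}} over a bucket holding {{data/a.parquet}} and {{data/2021/b.parquet}} reports one object and the common prefix {{data/2021/}}. Once the last object under {{data/2021/}} is deleted, {{PrefixHasObjects}} returns false for it, which is why the "directory" has no independent existence to report on.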


> [C++] Add filesystem / IO implementation for Google Cloud Storage
> -----------------------------------------------------------------
>
>                 Key: ARROW-1231
>                 URL: https://issues.apache.org/jira/browse/ARROW-1231
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Carlos O'Ryan
>            Priority: Major
>              Labels: filesystem
>
> See example jumping off point
> https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/platform/cloud



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
