coryan commented on a change in pull request #11812:
URL: https://github.com/apache/arrow/pull/11812#discussion_r760205288



##########
File path: cpp/src/arrow/filesystem/gcsfs.cc
##########
@@ -324,17 +407,46 @@ Result<std::shared_ptr<io::InputStream>> GcsFileSystem::OpenInputStream(
     return Status::IOError("Only files can be opened as input streams");
   }
   ARROW_ASSIGN_OR_RAISE(auto p, GcsPath::FromString(info.path()));
-  return impl_->OpenInputStream(p);
+  return impl_->OpenInputStream(p.bucket, p.object, gcs::Generation(),
+                                gcs::ReadFromOffset());
 }
 
 Result<std::shared_ptr<io::RandomAccessFile>> GcsFileSystem::OpenInputFile(
     const std::string& path) {
-  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+  ARROW_ASSIGN_OR_RAISE(auto p, GcsPath::FromString(path));
+  auto metadata = impl_->GetObjectMetadata(p);

Review comment:
   Yes, it does. I am trying to ensure that `Read()`, `ReadAt()`, and `Seek()`
(when seeking backwards) all use the same generation of an object [*]. We
could use undocumented (and likely to break) APIs to extract the generation
without this roundtrip. If the roundtrip (and, I should add, the additional
API charges) turns out to matter, I would rather add a documented API to the
C++ client library and use that here.
   
   
   [*]: You probably know this, but objects in GCS are versioned. More than
one version of the same object can exist, and the "latest" version can be
replaced while you are reading from it. I think all operations on one
`io::RandomAccessFile` should refer to the same generation.
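
   To illustrate the concern, here is a minimal, self-contained sketch of
generation pinning. It uses a toy in-memory store rather than the real GCS
client: `ToyBucket`, `PinnedReader`, and `kLatest` are hypothetical names
invented for this example and are not part of google-cloud-cpp or Arrow. The
one metadata lookup in the constructor stands in for the extra roundtrip
discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical sentinel meaning "whatever version is current right now".
const std::int64_t kLatest = 0;

// Toy in-memory stand-in for a versioned object store (NOT the GCS API).
class ToyBucket {
 public:
  // Stores a new version of `name` and returns its generation number.
  std::int64_t Put(const std::string& name, std::string contents) {
    const std::int64_t generation = ++next_generation_;
    objects_[name][generation] = std::move(contents);
    return generation;
  }

  // Metadata lookup: resolves "latest" to a concrete generation.
  // Returns kLatest if the object does not exist.
  std::int64_t LatestGeneration(const std::string& name) const {
    auto it = objects_.find(name);
    if (it == objects_.end()) return kLatest;
    return it->second.rbegin()->first;
  }

  // Reads up to `count` bytes at `offset` from the requested generation.
  // With kLatest, reads whichever version is newest at call time.
  std::string Read(const std::string& name, std::size_t offset,
                   std::size_t count,
                   std::int64_t generation = kLatest) const {
    auto it = objects_.find(name);
    if (it == objects_.end()) return "";
    const auto& versions = it->second;
    auto v = generation == kLatest ? std::prev(versions.end())
                                   : versions.find(generation);
    if (v == versions.end() || offset > v->second.size()) return "";
    return v->second.substr(offset, count);
  }

 private:
  std::map<std::string, std::map<std::int64_t, std::string>> objects_;
  std::int64_t next_generation_ = 0;
};

// Resolves the generation once at open time, then passes it to every read,
// so all reads see the same version even if the object is replaced.
class PinnedReader {
 public:
  PinnedReader(const ToyBucket& bucket, std::string name)
      : bucket_(&bucket),
        name_(std::move(name)),
        generation_(bucket.LatestGeneration(name_)) {
    // This sketch simply requires the object to exist when opened.
    assert(generation_ != kLatest);
  }

  std::string ReadAt(std::size_t offset, std::size_t count) const {
    return bucket_->Read(name_, offset, count, generation_);
  }

 private:
  const ToyBucket* bucket_;
  std::string name_;
  std::int64_t generation_;
};
```

   In this sketch, a `PinnedReader` opened before an overwrite keeps returning
bytes from the version it resolved at open time, while an unpinned
`ToyBucket::Read()` switches to the new version mid-stream.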
   



