coryan commented on a change in pull request #11331:
URL: https://github.com/apache/arrow/pull/11331#discussion_r724421894



##########
File path: cpp/src/arrow/filesystem/gcsfs.cc
##########
@@ -0,0 +1,143 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <google/cloud/storage/client.h>
+
+#include <sstream>
+
+#include "arrow/filesystem/path_util.h"
+#include "arrow/result.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace fs {
+
+namespace gcs = google::cloud::storage;
+
+google::cloud::Options AsGoogleCloudOptions(GcsOptions const& o) {

Review comment:
       Thanks.  I did not mean to restart the "east const" vs. "const west" 
holy war, but muscle memory is what it is.
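
       For anyone not familiar with the two spellings: `GcsOptions const&` and 
`const GcsOptions&` name the same type, so the choice is purely stylistic. A 
minimal sketch, using a stand-in `GcsOptions` struct and a hypothetical 
`EndpointOf()` helper (neither is the real Arrow code):

```cpp
#include <string>
#include <type_traits>

// Stand-in for illustration only; not the real arrow::fs::GcsOptions.
struct GcsOptions {
  std::string endpoint_override;
  std::string scheme;
};

// "East const" and "const west" spell exactly the same reference type.
static_assert(std::is_same<GcsOptions const&, const GcsOptions&>::value,
              "both spellings name the same type");

// Hypothetical helper, declared with one spelling and defined with the other.
// This compiles because both lines refer to the same function signature.
std::string EndpointOf(GcsOptions const& o);
std::string EndpointOf(const GcsOptions& o) { return o.endpoint_override; }
```

       Since the compiler treats both forms identically, following the 
surrounding code's convention costs nothing.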

##########
File path: cpp/src/arrow/filesystem/gcsfs.cc
##########
@@ -0,0 +1,143 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <google/cloud/storage/client.h>
+
+#include <sstream>
+
+#include "arrow/filesystem/path_util.h"
+#include "arrow/result.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace fs {
+
+namespace gcs = google::cloud::storage;
+
+google::cloud::Options AsGoogleCloudOptions(GcsOptions const& o) {
+  auto options = google::cloud::Options{};

Review comment:
       "Almost Always Auto", another holy war. I can change it if you think it 
is confusing.
   

##########
File path: cpp/src/arrow/filesystem/gcsfs.cc
##########
@@ -0,0 +1,143 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <google/cloud/storage/client.h>
+
+#include <sstream>
+
+#include "arrow/filesystem/path_util.h"
+#include "arrow/result.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace fs {
+
+namespace gcs = google::cloud::storage;
+
+google::cloud::Options AsGoogleCloudOptions(GcsOptions const& o) {
+  auto options = google::cloud::Options{};
+  if (!o.endpoint_override.empty()) {
+    auto scheme = o.scheme;

Review comment:
       Done.

##########
File path: cpp/src/arrow/filesystem/gcsfs.h
##########
@@ -0,0 +1,130 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/filesystem/filesystem.h"
+
+namespace arrow {
+namespace fs {
+class GcsFileSystem;
+struct GcsOptions;
+namespace internal {
+// TODO(ARROW-1231) - during development only tests should create a GcsFileSystem.
+//     Remove, and provide a public API, before declaring the feature complete.
+std::shared_ptr<GcsFileSystem> MakeGcsFileSystemForTest(const GcsOptions& options);

Review comment:
       Fixed to make the TODO more explicit

##########
File path: cpp/src/arrow/filesystem/gcsfs.h
##########
@@ -0,0 +1,130 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/filesystem/filesystem.h"
+
+namespace arrow {
+namespace fs {
+class GcsFileSystem;
+struct GcsOptions;
+namespace internal {
+// TODO(ARROW-1231) - during development only tests should create a GcsFileSystem.
+//     Remove, and provide a public API, before declaring the feature complete.
+std::shared_ptr<GcsFileSystem> MakeGcsFileSystemForTest(const GcsOptions& options);

Review comment:
       I hope that is what you had in mind.
   

##########
File path: cpp/src/arrow/filesystem/gcsfs.cc
##########
@@ -0,0 +1,143 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <google/cloud/storage/client.h>
+
+#include <sstream>
+
+#include "arrow/filesystem/path_util.h"
+#include "arrow/result.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace fs {
+
+namespace gcs = google::cloud::storage;
+
+google::cloud::Options AsGoogleCloudOptions(GcsOptions const& o) {
+  auto options = google::cloud::Options{};
+  if (!o.endpoint_override.empty()) {
+    auto scheme = o.scheme;
+    if (scheme.empty()) scheme = "https";
+    options.set<gcs::RestEndpointOption>(scheme + "://" + o.endpoint_override);
+  }
+  return options;
+}
+
+class GcsFileSystem::Impl {
+ public:
+  explicit Impl(GcsOptions const& o) : client_(AsGoogleCloudOptions(o)) {}
+
+ private:
+  gcs::Client client_;
+};
+
+std::string GcsFileSystem::type_name() const { return "gcs"; }
+
+bool GcsFileSystem::Equals(const FileSystem& other) const {
+  if (this == &other) {
+    return true;
+  }
+  if (other.type_name() != type_name()) {
+    return false;
+  }
+  const auto& fs = ::arrow::internal::checked_cast<const GcsFileSystem&>(other);
+  return impl_ == fs.impl_;
+}
+
+Result<FileInfo> GcsFileSystem::GetFileInfo(const std::string& path) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<FileInfoVector> GcsFileSystem::GetFileInfo(const FileSelector& select) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::CreateDir(const std::string& path, bool recursive) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::DeleteDir(const std::string& path) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::DeleteDirContents(const std::string& path) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::DeleteRootDirContents() {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::DeleteFile(const std::string& path) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::Move(const std::string& src, const std::string& dest) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Status GcsFileSystem::CopyFile(const std::string& src, const std::string& dest) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<std::shared_ptr<io::InputStream>> GcsFileSystem::OpenInputStream(
+    const std::string& path) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<std::shared_ptr<io::InputStream>> GcsFileSystem::OpenInputStream(
+    const FileInfo& info) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<std::shared_ptr<io::RandomAccessFile>> GcsFileSystem::OpenInputFile(
+    const std::string& path) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<std::shared_ptr<io::RandomAccessFile>> GcsFileSystem::OpenInputFile(
+    const FileInfo& info) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<std::shared_ptr<io::OutputStream>> GcsFileSystem::OpenOutputStream(
+    const std::string& path, const std::shared_ptr<const KeyValueMetadata>& metadata) {
+  return Status::NotImplemented("The GCS FileSystem is not fully implemented");
+}
+
+Result<std::shared_ptr<io::OutputStream>> GcsFileSystem::OpenAppendStream(
+    const std::string&, const std::shared_ptr<const KeyValueMetadata>&) {
+  return Status::NotImplemented("Append is not supported in GCS");
+}
+
+GcsFileSystem::GcsFileSystem(const GcsOptions& options, const io::IOContext& context)
+    : FileSystem(context), impl_(std::make_shared<Impl>(options)) {}
+
+namespace internal {
+
+std::shared_ptr<GcsFileSystem> MakeGcsFileSystemForTest(const GcsOptions& options) {
+  return std::shared_ptr<GcsFileSystem>(

Review comment:
       Done.

##########
File path: cpp/src/arrow/filesystem/gcsfs_test.cc
##########
@@ -0,0 +1,51 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <gmock/gmock-matchers.h>
+#include <gmock/gmock-more-matchers.h>
+#include <gtest/gtest.h>
+
+#include <string>
+
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/util.h"
+
+namespace arrow {
+namespace fs {
+namespace {
+
+using ::testing::IsEmpty;
+using ::testing::Not;
+using ::testing::NotNull;
+
+TEST(GCSFileSystem, Compare) {
+  auto a = internal::MakeGcsFileSystemForTest(GcsOptions{});
+  EXPECT_THAT(a.get(), NotNull());
+  EXPECT_EQ(a, a);
+
+  auto b = internal::MakeGcsFileSystemForTest(GcsOptions{});
+  EXPECT_THAT(b.get(), NotNull());
+  EXPECT_EQ(b, b);
+
+  EXPECT_NE(a, b);

Review comment:
       I do not know what "logically equals" means.  At a guess something like 
"modulo transient errors, both produce the same results"?  If so, I am not sure 
there is a reliable way to do that: consider `storage.googleapis.com` vs. 
`private.googleapis.com` as endpoint overrides, different strings, but 
depending on your environment, exactly the same behavior (or not!).
   
   Anyway, implemented something closer to the current implementation for the 
S3 FileSystem.
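
   One way to read "closer to the S3 FileSystem" is a comparison of the 
configured options field by field, with no attempt to decide whether two 
endpoints behave the same. A minimal sketch under that assumption; 
`OptionsSketch` and `SyntacticComparisonDemo()` are hypothetical names, only 
the two fields visible in this diff are modeled, and this is not the code in 
the PR:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in; only the two fields visible in the diff are modeled.
struct OptionsSketch {
  std::string endpoint_override;
  std::string scheme;

  // Purely syntactic: two option sets are equal only if every field matches.
  bool Equals(const OptionsSketch& other) const {
    return endpoint_override == other.endpoint_override && scheme == other.scheme;
  }
};

// Exercises the comparison with the two endpoints mentioned in the comment.
inline bool SyntacticComparisonDemo() {
  const OptionsSketch a{"storage.googleapis.com", "https"};
  const OptionsSketch b{"storage.googleapis.com", "https"};
  const OptionsSketch c{"private.googleapis.com", "https"};
  assert(a.Equals(a));  // an object always equals itself
  // Same strings compare equal; different strings compare unequal, even if
  // both endpoints happen to behave identically in a given environment.
  return a.Equals(b) && !a.Equals(c);
}
```

   The trade-off is that such a check can report "not equal" for two 
configurations that reach the same service, but it never needs network access 
or knowledge of the environment.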

##########
File path: cpp/src/arrow/filesystem/gcsfs_test.cc
##########
@@ -0,0 +1,51 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <gmock/gmock-matchers.h>
+#include <gmock/gmock-more-matchers.h>
+#include <gtest/gtest.h>
+
+#include <string>
+
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/util.h"
+
+namespace arrow {
+namespace fs {
+namespace {
+
+using ::testing::IsEmpty;
+using ::testing::Not;
+using ::testing::NotNull;
+
+TEST(GCSFileSystem, Compare) {
+  auto a = internal::MakeGcsFileSystemForTest(GcsOptions{});
+  EXPECT_THAT(a.get(), NotNull());
+  EXPECT_EQ(a, a);
+
+  auto b = internal::MakeGcsFileSystemForTest(GcsOptions{});
+  EXPECT_THAT(b.get(), NotNull());

Review comment:
       Doh, fixed.

##########
File path: cpp/src/arrow/filesystem/gcsfs_test.cc
##########
@@ -0,0 +1,51 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/filesystem/gcsfs.h"
+
+#include <gmock/gmock-matchers.h>
+#include <gmock/gmock-more-matchers.h>
+#include <gtest/gtest.h>
+
+#include <string>
+
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/util.h"
+
+namespace arrow {
+namespace fs {
+namespace {
+
+using ::testing::IsEmpty;
+using ::testing::Not;
+using ::testing::NotNull;
+
+TEST(GCSFileSystem, Compare) {
+  auto a = internal::MakeGcsFileSystemForTest(GcsOptions{});
+  EXPECT_THAT(a.get(), NotNull());
+  EXPECT_EQ(a, a);

Review comment:
       Actually, we want `a->Equals()` here. Fixed.
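
       The distinction matters because `EXPECT_EQ(a, a)` on two 
`std::shared_ptr` values compares the stored pointers, so it never calls 
`Equals()`. A minimal sketch of the difference, using a hypothetical `FakeFs` 
class in place of the real filesystem:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a filesystem whose Equals() compares configuration.
class FakeFs {
 public:
  explicit FakeFs(std::string endpoint) : endpoint_(std::move(endpoint)) {}
  bool Equals(const FakeFs& other) const { return endpoint_ == other.endpoint_; }

 private:
  std::string endpoint_;
};

// Returns true when pointer comparison and Equals() disagree for two
// separately constructed but identically configured objects.
inline bool PointerAndLogicalEqualityDiffer() {
  auto a = std::make_shared<FakeFs>("localhost:9000");
  auto b = std::make_shared<FakeFs>("localhost:9000");
  assert(a != nullptr && b != nullptr);
  const bool same_pointer = (a == b);          // compares addresses
  const bool logically_equal = a->Equals(*b);  // compares configuration
  return !same_pointer && logically_equal;
}
```

       In a test, that means asserting on `a->Equals(*b)` rather than on the 
smart pointers themselves.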

##########
File path: cpp/CMakeLists.txt
##########
@@ -801,7 +801,11 @@ endif()
 set(ARROW_SHARED_PRIVATE_LINK_LIBS ${ARROW_STATIC_LINK_LIBS})
 
 # boost::filesystem is needed for S3 and Flight tests as a boost::process dependency.
-if(((ARROW_FLIGHT OR ARROW_S3) AND (ARROW_BUILD_TESTS OR ARROW_BUILD_INTEGRATION)))
+if(((ARROW_FLIGHT
+     OR ARROW_S3

Review comment:
       I do not think it is this change, as the build has `-DARROW_GCS=OFF`:
   
   
https://github.com/apache/arrow/pull/11331/checks?check_run_id=3830637295#step:9:62
   
   and the change is (modulo reformatting) changing `(ARROW_FLIGHT OR 
ARROW_S3)` to `(ARROW_FLIGHT OR ARROW_S3 OR ARROW_GCS)`.  In addition, this 
change has been there since the first commit in the branch. 
   
   At a guess, this is either a transient failure or the point in `master` 
where I based the branch has a problem. I can rebase if you think that would 
help.

##########
File path: cpp/CMakeLists.txt
##########
@@ -801,7 +801,11 @@ endif()
 set(ARROW_SHARED_PRIVATE_LINK_LIBS ${ARROW_STATIC_LINK_LIBS})
 
 # boost::filesystem is needed for S3 and Flight tests as a boost::process dependency.
-if(((ARROW_FLIGHT OR ARROW_S3) AND (ARROW_BUILD_TESTS OR ARROW_BUILD_INTEGRATION)))
+if(((ARROW_FLIGHT
+     OR ARROW_S3

Review comment:
       Looking at the builds at `master`, I do not think a rebase would help, 
and it seems even less likely that these changes caused the build breaks:
   
   https://github.com/apache/arrow/actions/runs/1317260919
   
   https://github.com/apache/arrow/actions?query=event%3Apush+branch%3Amaster




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]