HyunWooZZ commented on code in PR #46416:
URL: https://github.com/apache/arrow/pull/46416#discussion_r2088351982
##########
cpp/src/arrow/filesystem/gcsfs.cc:
##########
@@ -353,7 +353,10 @@ class GcsFileSystem::Impl {
// matches the prefix we assume it is a directory.
std::string canonical = internal::EnsureTrailingSlash(path.object);
auto list_result = client_.ListObjects(path.bucket,
gcs::Prefix(canonical));
- if (list_result.begin() != list_result.end()) {
+
+ // Check that the result is valid before determining whether the list is empty.
+ auto it = list_result.begin();
+ if (it != list_result.end() && *it) {
Review Comment:
@kou Thank you for your reply!
I think I need to clearly define the problem first.
The issue I'm currently experiencing is that even though I don't have the
appropriate permissions for GCS, instead of throwing an error, it returns the
file as if it were a directory.
The function logic is as follows:
1. It checks whether the file exists in the bucket and stores that
information in info.
- In GetFileInfoObject, it seems to return FileInfo(path.full_path,
FileType::NotFound).
2. Therefore, since info.ok() is true and the FileType is NotFound, it does not
enter the following branch:
```cpp
if (!info.ok() || info->type() != FileType::NotFound)
```
3. As a result, it proceeds to check if there are any objects under full
path/.
4. In step 3, it should first verify that the element the iterator points to
is valid before treating a non-empty range (begin != end) as a directory.
However, it doesn't perform this check, which leads to the issue.
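To illustrate step 4, here is a minimal, self-contained sketch that does not use the real GCS client. `Entry` is a hypothetical stand-in for the `StatusOr`-like value that the list iterator yields, and the listing is modeled as a plain `std::vector`. It assumes that a failed listing shows up as a single element carrying an error, which is how I understand the client to behave.

```cpp
#include <cassert>
#include <optional>
#including_placeholder
```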
My test code.
```cpp
#include "google/cloud/storage/client.h"
#include "google/cloud/status.h"
#include <iostream>
#include <stdexcept>
#include <string>

namespace gcs = ::google::cloud::storage;

int main(int argc, char* argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: " << argv[0] << " <bucket-name>\n";
    return 1;
  }
  std::string bucket_name = argv[1];
  auto client = gcs::Client();
  std::string object_name = "test-object";
  // Look up the object's metadata and print the status if the call fails.
  auto meta = client.GetObjectMetadata(bucket_name, object_name);
  if (!meta) {
    std::cout << "Error Code: " << static_cast<int>(meta.status().code())
              << std::endl;
    std::cout << "Status string: "
              << google::cloud::StatusCodeToString(meta.status().code())
              << std::endl;
    std::cerr << "object_meta: " << meta.status().message() << "\n";
    return 1;
  }
  return 0;
}
```
My test result.
```
Error Code: 5
Status string: NOT_FOUND
object_meta: Permanent error, with a last message of Could not create a
OAuth2 access token to authenticate the request. The request was not sent, as
such an access token is required to complete the request successfully. Learn
more about Google Cloud authentication at
https://cloud.google.com/docs/authentication. The underlying error message was:
{"error":"invalid_request","error_description":"Service account not enabled on
this instance"}
```
##########
cpp/src/arrow/filesystem/gcsfs.cc:
##########
@@ -353,7 +353,10 @@ class GcsFileSystem::Impl {
// matches the prefix we assume it is a directory.
std::string canonical = internal::EnsureTrailingSlash(path.object);
auto list_result = client_.ListObjects(path.bucket,
gcs::Prefix(canonical));
- if (list_result.begin() != list_result.end()) {
+
+ // Check that the result is valid before determining whether the list is empty.
+ auto it = list_result.begin();
+ if (it != list_result.end() && *it) {
Review Comment:
I tested the `auto meta = client_.GetObjectMetadata(path.bucket, path.object);`
call, and it returns the test result above.
##########
cpp/src/arrow/filesystem/gcsfs.cc:
##########
@@ -353,7 +353,10 @@ class GcsFileSystem::Impl {
// matches the prefix we assume it is a directory.
std::string canonical = internal::EnsureTrailingSlash(path.object);
auto list_result = client_.ListObjects(path.bucket,
gcs::Prefix(canonical));
- if (list_result.begin() != list_result.end()) {
+
+ // Check that the result is valid before determining whether the list is empty.
+ auto it = list_result.begin();
+ if (it != list_result.end() && *it) {
Review Comment:
I will ping coryan!
Hi @coryan :)
Do you have any ideas about this issue?