[GitHub] [arrow] corleyma commented on a change in pull request #7803: ARROW-9517: [C++/Python] Add optional session_token to S3Options

2020-08-11 Thread GitBox


corleyma commented on a change in pull request #7803:
URL: https://github.com/apache/arrow/pull/7803#discussion_r468369686



##
File path: cpp/src/arrow/filesystem/s3fs.h
##
@@ -62,10 +75,18 @@ struct ARROW_EXPORT S3Options {
   void ConfigureAnonymousCredentials();
 
   /// Configure with explicit access and secret key.
-  void ConfigureAccessKey(const std::string& access_key, const std::string& 
secret_key);
+  void ConfigureAccessKey(const std::string& access_key, const std::string& 
secret_key,
+  const std::string& session_token = "");
+
+  /// Configure with credentials from an assumed role.
+  void ConfigureAssumeRoleCredentials(
+  const std::string& role_arn, const std::string& session_name = "",
+  const std::string& external_id = "", int load_frequency = 900,
+  const std::shared_ptr& stsClient = nullptr);

Review comment:
   I inferred this is something to do with compliance with C++/CLI; 
replaced with existing NULLPTR macro.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] corleyma commented on a change in pull request #7803: ARROW-9517: [C++/Python] Add optional session_token to S3Options

2020-08-10 Thread GitBox


corleyma commented on a change in pull request #7803:
URL: https://github.com/apache/arrow/pull/7803#discussion_r468195140



##
File path: cpp/src/arrow/filesystem/s3fs.h
##
@@ -62,10 +75,18 @@ struct ARROW_EXPORT S3Options {
   void ConfigureAnonymousCredentials();
 
   /// Configure with explicit access and secret key.
-  void ConfigureAccessKey(const std::string& access_key, const std::string& 
secret_key);
+  void ConfigureAccessKey(const std::string& access_key, const std::string& 
secret_key,
+  const std::string& session_token = "");
+
+  /// Configure with credentials from an assumed role.
+  void ConfigureAssumeRoleCredentials(
+  const std::string& role_arn, const std::string& session_name = "",
+  const std::string& external_id = "", int load_frequency = 900,
+  const std::shared_ptr& stsClient = nullptr);

Review comment:
   `cpplint` does not like this, but I am just mirroring the default value 
used by the SDK.  Should I suppress the violation or is there indeed a better 
practice?  I'm a C++ neophyte so pointers (ha!) appreciated.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] corleyma commented on a change in pull request #7803: ARROW-9517: [C++/Python] Add optional session_token to S3Options

2020-08-10 Thread GitBox


corleyma commented on a change in pull request #7803:
URL: https://github.com/apache/arrow/pull/7803#discussion_r468174939



##
File path: cpp/src/arrow/filesystem/s3fs_test.cc
##
@@ -197,6 +197,32 @@ TEST(S3Options, FromUri) {
   ASSERT_RAISES(Invalid, S3Options::FromUri("s3:///foo/bar/", ));
 }
 
+TEST(S3Options, FromAccessKey) {
+  S3Options options;
+
+  // session token is optional and should default to empty string
+  options = S3Options::FromAccessKey("access", "secret");
+  ASSERT_EQ(options.GetAccessKey(), "access");
+  ASSERT_EQ(options.GetSecretKey(), "secret");
+  ASSERT_EQ(options.GetSessionToken(), "");
+
+  options = S3Options::FromAccessKey("access", "secret", "token");
+  ASSERT_EQ(options.GetAccessKey(), "access");
+  ASSERT_EQ(options.GetSecretKey(), "secret");
+  ASSERT_EQ(options.GetSessionToken(), "token");
+}
+
+TEST(S3Options, FromAssumeRole) {
+  S3Options options;
+
+  // the only required argument should be role arn
+  options = S3Options::FromAssumeRole("my_role_arn");

Review comment:
   I added 2 relevant changes:
   - I updated the test to set an env var that will cause 
DefaultAWSCredentialsProviderChain to skip trying the EC2 instance metadata 
endpoint
   - I added the ability to provide your own STSClient instead of using the 
default configuration, so users can choose to use something other than e.g. the 
DefaultAWSCredentialsProviderChain.
   The test runtime is down from ~5s to ~3ms.

##
File path: cpp/src/arrow/filesystem/s3fs_test.cc
##
@@ -197,6 +197,32 @@ TEST(S3Options, FromUri) {
   ASSERT_RAISES(Invalid, S3Options::FromUri("s3:///foo/bar/", ));
 }
 
+TEST(S3Options, FromAccessKey) {
+  S3Options options;
+
+  // session token is optional and should default to empty string
+  options = S3Options::FromAccessKey("access", "secret");
+  ASSERT_EQ(options.GetAccessKey(), "access");
+  ASSERT_EQ(options.GetSecretKey(), "secret");
+  ASSERT_EQ(options.GetSessionToken(), "");
+
+  options = S3Options::FromAccessKey("access", "secret", "token");
+  ASSERT_EQ(options.GetAccessKey(), "access");
+  ASSERT_EQ(options.GetSecretKey(), "secret");
+  ASSERT_EQ(options.GetSessionToken(), "token");
+}
+
+TEST(S3Options, FromAssumeRole) {
+  S3Options options;
+
+  // the only required argument should be role arn
+  options = S3Options::FromAssumeRole("my_role_arn");

Review comment:
   I added 2 relevant changes:
   - I updated the test to set an env var that will cause 
DefaultAWSCredentialsProviderChain to skip trying the EC2 instance metadata 
endpoint
   - I added the ability to provide your own STSClient instead of using the 
default configuration, so users can choose to use something other than e.g. the 
DefaultAWSCredentialsProviderChain.
   
   The test runtime is down from ~5s to ~3ms.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] corleyma commented on a change in pull request #7803: ARROW-9517: [C++/Python] Add optional session_token to S3Options

2020-08-10 Thread GitBox


corleyma commented on a change in pull request #7803:
URL: https://github.com/apache/arrow/pull/7803#discussion_r467822801



##
File path: cpp/src/arrow/filesystem/s3fs.h
##
@@ -62,10 +62,12 @@ struct ARROW_EXPORT S3Options {
   void ConfigureAnonymousCredentials();
 
   /// Configure with explicit access and secret key.
-  void ConfigureAccessKey(const std::string& access_key, const std::string& 
secret_key);
+  void ConfigureAccessKey(const std::string& access_key, const std::string& 
secret_key,
+  const std::string& session_token = "");
 
   std::string GetAccessKey() const;
   std::string GetSecretKey() const;
+  std::string GetSessionToken() const;

Review comment:
   I added C++ tests to `s3fs_test.cc` to cover these changes.  
Unfortunately, adding support for STS-like temporary credentials to Minio 
involves addition of another dependency, Keycloak.  I spent a bit of time 
reading relevant docs but could not seem to find a programmatic way to 
replicate the desired behavior (the Keycloak guides I found were very 
UI-driven).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] corleyma commented on a change in pull request #7803: ARROW-9517: [C++/Python] Add optional session_token to S3Options

2020-08-10 Thread GitBox


corleyma commented on a change in pull request #7803:
URL: https://github.com/apache/arrow/pull/7803#discussion_r467821450



##
File path: python/pyarrow/_s3fs.pyx
##
@@ -105,9 +105,14 @@ cdef class S3FileSystem(FileSystem):
 raise ValueError(
 'Cannot pass anonymous=True together with access_key '
 'and secret_key.')
+
+if session_token is None:
+session_token = ""
+
 options = CS3Options.FromAccessKey(
 tobytes(access_key),
-tobytes(secret_key)
+tobytes(secret_key),
+tobytes(session_token)

Review comment:
   Done.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] corleyma commented on a change in pull request #7803: ARROW-9517: [C++/Python] Add optional session_token to S3Options

2020-08-10 Thread GitBox


corleyma commented on a change in pull request #7803:
URL: https://github.com/apache/arrow/pull/7803#discussion_r467821545



##
File path: python/pyarrow/_s3fs.pyx
##
@@ -81,9 +81,9 @@ cdef class S3FileSystem(FileSystem):
 cdef:
 CS3FileSystem* s3fs
 
-def __init__(self, *, access_key=None, secret_key=None, anonymous=False,
- region=None, scheme=None, endpoint_override=None,
- bint background_writes=True):
+def __init__(self, *, access_key=None, secret_key=None, session_token=None,

Review comment:
   updated docstring accordingly, thanks for catching that.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org