[
https://issues.apache.org/jira/browse/SPARK-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192411#comment-14192411
]
Chris Fregly commented on SPARK-3640:
-------------------------------------
Agreed that this was not ideal when I first chose this implementation. And as
you mentioned, the NotSerializableException is exactly why I went with the
DefaultCredentialsProvider.
So I spent some time trying to solve this using AWS IAM policies attached to
separate IAM users under your root AWS account. This appears to work well with
the existing DefaultCredentialsProvider.
Is this a viable option for you?
Basically, every user would get their own ACCESS_KEY_ID and SECRET_KEY, which
would be used in place of the root credentials.
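To illustrate why no code change is needed, here is a minimal sketch that mimics (but is not) the lookup order of the AWS SDK's DefaultAWSCredentialsProviderChain: environment variables, then JVM system properties, then the profile file. The class name, helper, and key value are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Sketch only -- mimics the resolution order of the AWS SDK's
// DefaultAWSCredentialsProviderChain; it is not the SDK itself.
public class CredentialsLookup {

    // Return the first non-null value produced by the ordered sources.
    public static String resolve(List<Supplier<String>> sources) {
        for (Supplier<String> source : sources) {
            String value = source.get();
            if (value != null) {
                return value;
            }
        }
        throw new IllegalStateException("no credentials found in any source");
    }

    public static void main(String[] args) {
        // Hypothetical per-user key: in practice each unix user exports
        // AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY in their own shell
        // profile, and the chain picks them up without touching the job code.
        Map<String, String> fakeEnv = Map.of("AWS_ACCESS_KEY_ID", "AKIAEXAMPLE");
        String accessKeyId = resolve(List.of(
                () -> fakeEnv.get("AWS_ACCESS_KEY_ID"),      // 1. environment
                () -> System.getProperty("aws.accessKeyId"), // 2. JVM property
                () -> null                                   // 3. profile file (stubbed)
        ));
        System.out.println(accessKeyId);
    }
}
```

Because each unix user resolves to a different IAM user's keys, the existing DefaultCredentialsProvider code path stays untouched.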
For thoroughness, I've included links to the instructions as well as an example
IAM Policy JSON. (I'll also add this to the Spark Kinesis Developer Guide:
http://spark.apache.org/docs/latest/streaming-kinesis-integration.html)
Creating IAM users
* http://docs.aws.amazon.com/IAM/latest/UserGuide/Using_SettingUpUser.html
* https://console.aws.amazon.com/iam/home?#security_credential
Setting up the Kinesis, DynamoDB, and CloudWatch IAM Policy for the new users
* http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-using-iam.html
IAM Policy Generator
* http://awspolicygen.s3.amazonaws.com/policygen.html
Attaching the Custom Policy
* https://console.aws.amazon.com/iam/home?#users
* Select the user
* Select Attach Policy
* Select Custom Policy
IAM Policy JSON
This was generated using the Policy Generator above... just fill in the
pieces specific to your environment.
{
  "Statement": [
    {
      "Sid": "Stmt1414784467497",
      "Action": "kinesis:*",
      "Effect": "Allow",
      "Resource": "arn:aws:kinesis:<region-of-stream>:<aws-account-id>:stream/<stream-name>"
    },
    {
      "Sid": "Stmt1414784693732",
      "Action": "dynamodb:*",
      "Effect": "Allow",
      "Resource": "arn:aws:dynamodb:us-east-1:<aws-account-id>:table/<dynamodb-tablename>"
    },
    {
      "Sid": "Stmt1414785131046",
      "Action": "cloudwatch:*",
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
Notes:
* The region of the DynamoDB table is intentionally hard-coded to us-east-1,
as this is how Kinesis currently works
* The DynamoDB table name is the same as the application name of the Kinesis
Streaming Application. The sample included with the Spark distribution uses
KinesisWordCount as the application/table name.
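Filling in the policy's placeholders amounts to assembling the Resource ARN strings. A minimal sketch, where the region, account id, and stream name are entirely hypothetical placeholders and the `arn` helper is mine, not an AWS API:

```java
// Sketch: assemble the Resource ARNs used in the policy above.
public class PolicyArns {

    // Hypothetical helper -- not part of any AWS SDK.
    public static String arn(String service, String region,
                             String accountId, String resource) {
        return String.format("arn:aws:%s:%s:%s:%s",
                service, region, accountId, resource);
    }

    public static void main(String[] args) {
        String accountId = "123456789012"; // hypothetical account id

        // Substitute your stream's own region and name here.
        String kinesisArn = arn("kinesis", "us-west-2", accountId,
                "stream/myStream");

        // DynamoDB region stays us-east-1 per the note above; the table is
        // named after the streaming application (KinesisWordCount in the
        // bundled sample).
        String dynamoArn = arn("dynamodb", "us-east-1", accountId,
                "table/KinesisWordCount");

        System.out.println(kinesisArn);
        System.out.println(dynamoArn);
    }
}
```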
Is this a sufficient workaround? Using IAM Policies is an AWS best practice,
but I'm not sure if this aligns with your existing environment. If not, I can
continue to investigate exposing that CredentialsProvider.
Lemme know, Aniket!
> KinesisUtils should accept a credentials object instead of forcing
> DefaultCredentialsProvider
> ---------------------------------------------------------------------------------------------
>
> Key: SPARK-3640
> URL: https://issues.apache.org/jira/browse/SPARK-3640
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.1.0
> Reporter: Aniket Bhatnagar
> Labels: kinesis
>
> KinesisUtils should accept AWS Credentials as a parameter and should default
> to DefaultCredentialsProvider if no credentials are provided. Currently, the
> implementation forces usage of DefaultCredentialsProvider which can be a pain
> especially when jobs are run by multiple unix users.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)