[
https://issues.apache.org/jira/browse/HADOOP-14556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621018#comment-16621018
]
Steve Loughran commented on HADOOP-14556:
-----------------------------------------
Patch 007; still a WiP.
* For people who want to review this, the code is also [on
github|https://github.com/steveloughran/hadoop/tree/s3/HADOOP-14556-delegation-token].
This fairly complex design is intended to
* support different back-end token bindings
* leave the door open for anyone who ever wants to add a Kerberos binding (as
Wasb permits).
Supported bindings
* Full: your normal AWS secrets. Should work with non-AWS S3 services.
* Session: session tokens are requested from STS.
* Role: this is the complex one, but the most significant. It asks for a
restricted role with a configured role ARN and a dynamically created role policy
restricted purely to the bucket & DDB table used by the FS (there are some
interfaces to let them tell the token binding what those policies are).
Example:
{code}
2018-09-19 19:15:10,324 [JUnit-testDTFileSystem] DEBUG auth.STSClientFactory
(STSClientFactory.java:requestRole(181)) - Requesting role
arn:aws:iam::11111111111:role/stevel-s3guard with duration 21600; policy = {
"Version" : "2012-10-17",
"Statement" : [ {
"Sid" : "7",
"Effect" : "Allow",
"Action" : [ "s3:GetBucketLocation", "s3:ListBucket" ],
"Resource" : "arn:aws:s3:::hwdev-steve-ireland-new"
}, {
"Sid" : "8",
"Effect" : "Allow",
"Action" : [ "s3:Get*", "s3:PutObject", "s3:DeleteObject",
"s3:AbortMultipartUpload", "s3:ListMultipartUploadParts", "s3:ListBucket*" ],
"Resource" : "arn:aws:s3:::hwdev-steve-ireland-new/*"
}, {
"Sid" : "1",
"Effect" : "Allow",
"Action" : [ "kms:Decrypt", "kms:GenerateDataKey" ],
"Resource" : "arn:aws:kms:*"
}, {
"Sid" : "9",
"Effect" : "Allow",
"Action" : [ "dynamodb:BatchGetItem", "dynamodb:BatchWriteItem",
"dynamodb:DeleteItem", "dynamodb:DescribeTable", "dynamodb:GetItem",
"dynamodb:PutItem", "dynamodb:Query", "dynamodb:UpdateItem" ],
"Resource" :
"arn:aws:dynamodb:eu-west-1:00000000000:table/hwdev-steve-ireland-new"
} ]
}
{code}
This token can be passed on to a shared hive/spark cluster, knowing that the
maximum access anything holding that token can have is full R/W access to the
destination bucket and any S3Guard table.
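As a rough illustration of how a policy like the one logged above could be assembled from just a bucket name and a table ARN, here is a minimal sketch. It is not the code in the patch: the class {{RolePolicySketch}} and its methods are hypothetical, the Sid values are arbitrary, and it builds JSON by string concatenation on the assumption that the inputs need no escaping (real code would use a JSON library).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper; illustrative only, not part of the patch. */
final class RolePolicySketch {

  private RolePolicySketch() {
  }

  /** Build one "Allow" statement: a list of actions on a single resource ARN. */
  static String statement(String sid, List<String> actions, String resource) {
    String actionList = actions.stream()
        .map(action -> "\"" + action + "\"")
        .collect(Collectors.joining(", "));
    return "{ \"Sid\" : \"" + sid + "\", \"Effect\" : \"Allow\", "
        + "\"Action\" : [ " + actionList + " ], "
        + "\"Resource\" : \"" + resource + "\" }";
  }

  /** Build a policy restricted to one bucket, KMS use, and one DynamoDB table. */
  static String policyFor(String bucket, String tableArn) {
    String bucketArn = "arn:aws:s3:::" + bucket;
    List<String> statements = Arrays.asList(
        // Bucket-level operations.
        statement("1",
            Arrays.asList("s3:GetBucketLocation", "s3:ListBucket"),
            bucketArn),
        // Object-level operations on everything under the bucket.
        statement("2",
            Arrays.asList("s3:Get*", "s3:PutObject", "s3:DeleteObject",
                "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts",
                "s3:ListBucket*"),
            bucketArn + "/*"),
        // KMS access for encrypted data.
        statement("3",
            Arrays.asList("kms:Decrypt", "kms:GenerateDataKey"),
            "arn:aws:kms:*"),
        // The S3Guard table.
        statement("4",
            Arrays.asList("dynamodb:BatchGetItem", "dynamodb:BatchWriteItem",
                "dynamodb:DeleteItem", "dynamodb:DescribeTable",
                "dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query",
                "dynamodb:UpdateItem"),
            tableArn));
    return "{ \"Version\" : \"2012-10-17\", \"Statement\" : [ "
        + String.join(", ", statements) + " ] }";
  }
}
```

Calling {{RolePolicySketch.policyFor(bucket, tableArn)}} returns a single-line JSON string with the same four statement groups as the logged policy.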
h3. Scale
There are some ILoad* tests to measure the sustainable rate of issuing STS
session and role tokens.
The TSV datasets [are available for
download|https://github.com/steveloughran/datasets/releases/tag/tag_2018-09-17-aws]
and analysis in your favourite notebook. Any analysis + different results from
different locations would be great!
Key points:
# You can get about 500-1000 requests/second before calls get rejected.
# Calls to STS do need to catch & retry on throttle events in case this
does occur.
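The catch-and-retry point could be handled along these lines. This is a minimal sketch with exponential backoff, not the retry logic in the patch: {{ThrottleRetry}} and {{ThrottledException}} are hypothetical names, with the exception standing in for whatever the STS client actually raises when a call is throttled.

```java
import java.util.function.Supplier;

/** Hypothetical helper; illustrative only, not part of the patch. */
final class ThrottleRetry {

  private ThrottleRetry() {
  }

  /** Hypothetical stand-in for the exception raised on a throttled STS call. */
  static class ThrottledException extends RuntimeException {
    ThrottledException(String message) {
      super(message);
    }
  }

  /**
   * Invoke the call, retrying on throttle events with a doubling delay.
   * Rethrows the last ThrottledException once maxAttempts is reached;
   * any other exception propagates immediately.
   */
  static <T> T retryOnThrottle(Supplier<T> call, int maxAttempts,
      long initialDelayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long delay = initialDelayMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.get();
      } catch (ThrottledException e) {
        if (attempt >= maxAttempts) {
          throw e;
        }
        try {
          Thread.sleep(delay);
        } catch (InterruptedException ie) {
          // Restore the interrupt flag and stop retrying.
          Thread.currentThread().interrupt();
          throw new IllegalStateException("Interrupted while backing off", ie);
        }
        delay *= 2;
      }
    }
  }
}
```

A real implementation would probably add jitter to the delay so that many clients throttled at once do not all retry at the same moment.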
For anyone planning to run those tests: you need to invoke them by name and set
-Dscale. Other users in your AWS account using the same STS endpoint may have
calls rejected for throttling too, which may be "observable". Test carefully by
selecting an explicit location and/or doing it in quiet periods.
h3. TODO
* If that token really does contain user info (i.e. someone ever did Kerberos
support), it should somehow be preserved. What to do?
* docs, obviously.
* I now know more about role permissions; improve our docs there too.
* FileContext tests are failing due to port mismatches in "canonical" paths,
hence the improved detail in the exception raised on failure. The issue is
still outstanding.
* S3A FS to pick up encryption settings from the DT; in particular, this will
permit SSE-C to propagate from client to shared service.
* Some downstream tests in Hive & Spark. These only seem to look for DTs if the
user has Kerberos enabled.
> S3A to support Delegation Tokens
> --------------------------------
>
> Key: HADOOP-14556
> URL: https://issues.apache.org/jira/browse/HADOOP-14556
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.2.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Attachments: HADOOP-14556-001.patch, HADOOP-14556-002.patch,
> HADOOP-14556-003.patch, HADOOP-14556-004.patch, HADOOP-14556-005.patch,
> HADOOP-14556-007.patch, HADOOP-14556.oath-002.patch, HADOOP-14556.oath.patch
>
>
> S3A to support delegation tokens where
> * an authenticated client can request a token via
> {{FileSystem.getDelegationToken()}}
> * Amazon's token service is used to request short-lived session secret & id;
> these will be saved in the token and marshalled with jobs
> * A new authentication provider will look for a token for the current user
> and authenticate the user if found
> This will not support renewals; the lifespan of a token will be limited to
> the initial duration. Also, as you can't request an STS token from a
> temporary session, IAM instances won't be able to issue tokens.