Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-05-20 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/692/

No changes


[Error replacing 'FILE' - Workspace is not accessible]

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2020-05-20 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/148/

[May 20, 2020 2:39:40 AM] (Yiqun Lin) HDFS-15340. RBF: Implement 
BalanceProcedureScheduler basic framework. Contributed by Jinglun.
[May 20, 2020 6:06:52 AM] (pjoseph) YARN-9606. Set sslfactory for 
AuthenticatedURL() while creating LogsCLI#webServiceClient.
[May 20, 2020 12:42:25 PM] (Steve Loughran) HADOOP-16900. Very large files can 
be truncated when written through the S3A FileSystem.
[May 20, 2020 4:23:56 PM] (Eric Yang) YARN-10228. Relax restriction of file 
path character in yarn.service.am.java.opts.
[May 20, 2020 6:51:48 PM] (noreply) HADOOP-17004. Fixing a formatting issue
[May 21, 2020 1:07:23 AM] (noreply) HDFS-15353. Use sudo instead of su to allow 
nologin user for secure DataNode (#2018)


[Error replacing 'FILE' - Workspace is not accessible]

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

Re: [DISCUSS] Secure Hadoop without Kerberos

2020-05-20 Thread Eric Yang
See my comments inline:

On Wed, May 20, 2020 at 4:50 PM Rajive Chittajallu  wrote:

> On Wed, May 20, 2020 at 1:47 PM Eric Yang  wrote:
> >
> >> > Kerberos was developed a decade before web development became popular.
> >> > There are some Kerberos limitations which do not work well in Hadoop.
> >> > A few examples of corner cases:
> >>
> >> Microsoft Active Directory, which is extensively used in many
> >> organizations, is based on Kerberos.
> >
> >
> > True, but with the rise of Google and AWS, OIDC seems to be a formidable
> > standard that can replace Kerberos for authentication.  I think providing
> > an option for the new standard is good for Hadoop.
> >
>
> I think you are referring to OAuth2, and adoption varies significantly
> across vendors. When one refers to Kerberos, it's mostly about MIT
> Kerberos or Microsoft Active Directory. But OAuth2 is a specification;
> implementations vary and are quite prone to bugs. I would be very
> careful in making a generic statement such as a "formidable standard".
>
> AWS services, at least in the context of data processing / analytics,
> do not support OAuth2. It's more of a GCP thing. AWS uses signed
> requests [1].
>
> [1] https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html


Kerberos is a protocol for authentication.  OIDC is also an authentication
protocol.  MIT Kerberos and the various OAuth2 frameworks are implementations,
not the authentication protocols themselves.  By no means am I suggesting we
adopt an OAuth2 framework; implementing according to the protocol spec is
better than hard-wiring to particular libraries.  We can adopt existing OIDC
libraries like pac4j to reduce the maintenance cost of implementing the OIDC
protocol in Hadoop.  AWS already offers OIDC authentication for EKS and as an
IAM identity provider.  By offering native OIDC support, Hadoop would be able
to access cloud services secured by OIDC more easily.
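For illustration, an OIDC ID token is a signed JWT whose claims any service in
the chain can inspect.  A minimal sketch in plain JDK Java (illustrative only;
a real deployment must verify the signature against the provider's JWKS, for
example through pac4j or another OIDC library):

    import java.nio.charset.StandardCharsets;
    import java.util.Base64;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class JwtPeek {
      public static void main(String[] args) {
        String jwt = args[0];                     // header.payload.signature
        String[] parts = jwt.split("\\.");
        // The payload (claims) is base64url-encoded JSON.
        String claims = new String(Base64.getUrlDecoder().decode(parts[1]),
            StandardCharsets.UTF_8);
        System.out.println("claims = " + claims);
        // Naive extraction of the "exp" claim (seconds since epoch);
        // good enough for a demo, not for production parsing.
        Matcher m = Pattern.compile("\"exp\"\\s*:\\s*(\\d+)").matcher(claims);
        if (m.find()) {
          long secondsLeft = Long.parseLong(m.group(1))
              - System.currentTimeMillis() / 1000;
          System.out.println("token expires in " + secondsLeft + "s");
        }
      }
    }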


>
>
>>
> >> > 1. The Kerberos principal doesn't encode a port number, so it is
> >> > difficult to know if the principal is coming from an authorized daemon
> >> > or a hacker container trying to forge a service principal.
> >>
> >> Clients use ephemeral ports. Not sure what the relevance of this
> >> statement is.
> >
> > Hint: CVE-2020-9492
> >
>
> It's a reserved one. You can help the conversation by describing a threat
> model.
>

The Hadoop security mailing list has the problem listed, if you are interested
in this area.  Hadoop Kerberos security quirks are off topic for
decoupling Kerberos from Hadoop.


> >> > 2. Hadoop Kerberos principals are used as high-privileged principals,
> >> > a form of credential to impersonate the end user.
> >>
> >> Principals are identities of the user. You can make identities fully
> >> qualified, to include the issuing authority if you want to. This is not
> >> Kerberos specific.
> >>
> >> Remember, Kerberos is an authentication mechanism. How those assertions
> >> are translated into authorization rules is application specific.
> >>
> >> Probably reconsider alternatives to auth_to_local rules.
> >
> >
> > Trust must be validated.  Hadoop Kerberos principals for services that
> > can perform impersonation are equal to root power.  Transporting root power
> > securely without it being intercepted is quite difficult when services are
> > running as root instead of daemons.  There is an alternate solution of always
> > forwarding a signed end-user token, so there is no need to validate the
> > proxy user credential.  The downside of forwarding signed tokens is that it
> > is difficult to forward multiple tokens of incompatible security mechanisms,
> > because the renewal mechanism and expiration time may not be deciphered by
> > the transport mechanism.  This is why using an SSO token is a good way to
> > ensure every library and framework abides by the same security practice and
> > to eliminate confused deputy problems.
>
> Trust of what? Service principals should not be used for
> authentication in a client context; they are there for server identification.


The trust refers to a service (Oozie/Hive) impersonating the end user: the
namenode issues a delegation token after checking the proxy user ACL.  The
form of credential presented to the namenode is a service TGT, not the end
user's TGT.  The service TGT is validated against the proxy user ACL on the
namenode to allow the impersonation to happen.  If the service TGT is
intercepted due to lack of encryption in the RPC or HTTP transport, the
service ticket is vulnerable to replay attacks.
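For readers less familiar with the proxy-user flow being described, here is a
minimal sketch using Hadoop's UserGroupInformation API (the "alice" user and
the path are made up for illustration; the service principal must be permitted
by the hadoop.proxyuser.<service>.hosts/groups settings that the namenode
checks):

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class ProxyUserSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The service (e.g. Oozie/Hive) logs in with its own credentials...
        UserGroupInformation service = UserGroupInformation.getLoginUser();
        // ...and builds a proxy UGI for the end user it acts on behalf of.
        UserGroupInformation alice =
            UserGroupInformation.createProxyUser("alice", service);
        // Everything inside doAs() reaches the namenode as "alice",
        // subject to the proxy-user ACL check discussed above.
        alice.doAs((PrivilegedExceptionAction<Void>) () -> {
          FileSystem fs = FileSystem.get(conf);
          System.out.println(fs.getFileStatus(new Path("/user/alice")));
          return null;
        });
      }
    }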


>
>
> OAuth2 (which the OIDC flow is based on) suggests JWT, which are signed
> tokens. Can you elaborate more on what you mean by "SSO Token"?


The SSO token is a JWT token in this context.  My advice is that only one
token should be transported, instead of multiple tokens, to prevent the
problem of out-of-sync expiration dates across multiple tokens.


>  To improve security for doAs use cases, add context to the calls. Just
> replacing Kerberos with a different authentication mechanism is not going to
> solve the problem.


The focus is to support an alternate security mechanism that may have been

Re: [DISCUSS] Secure Hadoop without Kerberos

2020-05-20 Thread Rajive Chittajallu
On Wed, May 20, 2020 at 1:47 PM Eric Yang  wrote:
>
>> > Kerberos was developed a decade before web development became popular.
>> > There are some Kerberos limitations which do not work well in Hadoop.  A
>> > few examples of corner cases:
>>
>> Microsoft Active Directory, which is extensively used in many organizations,
>> is based on Kerberos.
>
>
> True, but with the rise of Google and AWS, OIDC seems to be a formidable
> standard that can replace Kerberos for authentication.  I think providing an
> option for the new standard is good for Hadoop.
>

I think you are referring to OAuth2, and adoption varies significantly
across vendors. When one refers to Kerberos, it's mostly about MIT
Kerberos or Microsoft Active Directory. But OAuth2 is a specification;
implementations vary and are quite prone to bugs. I would be very
careful in making a generic statement such as a "formidable standard".

AWS services, at least in the context of data processing / analytics,
do not support OAuth2. It's more of a GCP thing. AWS uses signed
requests [1].

[1] https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html

>>
>> > 1. The Kerberos principal doesn't encode a port number, so it is difficult to know
>> > if the principal is coming from an authorized daemon or a hacker container
>> > trying to forge a service principal.
>>
>> Clients use ephemeral ports. Not sure what the relevance of this
>> statement is.
>
> Hint: CVE-2020-9492
>

It's a reserved one. You can help the conversation by describing a threat model.

>> > 2. Hadoop Kerberos principals are used as high-privileged principals, a form
>> > of credential to impersonate the end user.
>>
>> Principals are identities of the user. You can make identities fully
>> qualified, to include the issuing authority if you want to. This is not
>> Kerberos specific.
>>
>> Remember, Kerberos is an authentication mechanism. How those assertions
>> are translated into authorization rules is application specific.
>>
>> Probably reconsider alternatives to auth_to_local rules.
>
>
> Trust must be validated.  Hadoop Kerberos principals for services that can
> perform impersonation are equal to root power.  Transporting root power securely
> without it being intercepted is quite difficult when services are running as
> root instead of daemons.  There is an alternate solution of always forwarding a
> signed end-user token, so there is no need to validate the proxy user
> credential.  The downside of forwarding signed tokens is that it is difficult to
> forward multiple tokens of incompatible security mechanisms, because the renewal
> mechanism and expiration time may not be deciphered by the transport mechanism.
> This is why using an SSO token is a good way to ensure every library and
> framework abides by the same security practice and to eliminate confused deputy
> problems.

Trust of what? Service principals should not be used for authentication in a
client context; they are there for server identification.

OAuth2 (which the OIDC flow is based on) suggests JWT, which are signed
tokens. Can you elaborate more on what you mean by "SSO Token"?

To improve security for doAs use cases, add context to the calls. Just
replacing Kerberos with a different authentication mechanism is not going to
solve the problem.

And how to improve proxy user use cases varies by application. Asserting an
'on-behalf-of' action when there is an active client on the other end (e.g.
HDFS proxy) would be different from one that is initiated on a schedule,
e.g. Oozie.


>>
>> > 3. Delegation tokens may allow expired users to continue to run jobs long
>> > after they are gone, without rechecking if end user credentials are still
>> > valid.
>>
>> Delegation tokens are a Hadoop-specific implementation whose lifecycle is
>> outside the scope of Kerberos. Hadoop (NN/RM) can periodically check the
>> respective IdP policy and revoke tokens, or have a central token
>> management service, similar to KMS.
>>
>> > 4. Passing different forms of tokens does not work well with cloud provider
>> > security mechanisms.  For example, passing an AWS STS token for an S3 bucket.
>> > There is no renewal mechanism, nor a good way to identify when the token
>> > would expire.
>>
>> This is outside the scope of Kerberos.
>>
>> Assuming you are using YARN, making the RM handle S3 temp credentials,
>> similar to HDFS delegation tokens, is something to consider.
>>
>> > There are companies that work on bridging security mechanisms of different
>> > types, but this is not a primary goal for Hadoop.  Hadoop can benefit from
>> > modernized security using open standards like OpenID Connect, which
>> > proposes to unify web applications using SSO.  This ensures the client
>> > credentials are transported in each stage of client-server interaction.
>> > This may improve overall security, and provide a more cloud-native form
>> > factor.  I wonder if there is any interest in the community to enable
>> > Hadoop OpenID Connect integration work?
>>
>> End to end identity assertion is where 

Re: [DISCUSS] Secure Hadoop without Kerberos

2020-05-20 Thread Rajive Chittajallu
On Wed, May 6, 2020 at 3:32 PM Eric Yang  wrote:
>
> Hi all,
>
> Kerberos was developed a decade before web development became popular.
> There are some Kerberos limitations which do not work well in Hadoop.  A
> few examples of corner cases:

Microsoft Active Directory, which is extensively used in many organizations,
is based on Kerberos.

> 1. The Kerberos principal doesn't encode a port number, so it is difficult to know
> if the principal is coming from an authorized daemon or a hacker container
> trying to forge a service principal.

Clients use ephemeral ports. Not sure what the relevance of this statement is.

> 2. Hadoop Kerberos principals are used as high-privileged principals, a form
> of credential to impersonate the end user.

Principals are identities of the user. You can make identities fully qualified,
to include the issuing authority if you want to. This is not Kerberos specific.

Remember, Kerberos is an authentication mechanism. How those assertions
are translated into authorization rules is application specific.

Probably reconsider alternatives to auth_to_local rules.

> 3. Delegation tokens may allow expired users to continue to run jobs long
> after they are gone, without rechecking if end user credentials are still
> valid.

Delegation tokens are a Hadoop-specific implementation whose lifecycle is
outside the scope of Kerberos. Hadoop (NN/RM) can periodically check the
respective IdP policy and revoke tokens, or have a central token
management service, similar to KMS.
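For context, a small client-side sketch of obtaining HDFS delegation tokens
(the "yarn" renewer is an example value; lifetime and renewal policy are
governed by namenode-side configuration, not by Kerberos):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.security.Credentials;
    import org.apache.hadoop.security.token.Token;

    public class DelegationTokenSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Credentials creds = new Credentials();
        // Ask the filesystem for delegation tokens, naming the principal
        // that will be allowed to renew them (here: the YARN RM user).
        Token<?>[] tokens = fs.addDelegationTokens("yarn", creds);
        for (Token<?> t : tokens) {
          System.out.println("kind=" + t.getKind() + " service=" + t.getService());
        }
        // creds now carries the tokens and can be shipped with a job submission.
      }
    }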

> 4. Passing different forms of tokens does not work well with cloud provider
> security mechanisms.  For example, passing an AWS STS token for an S3 bucket.
> There is no renewal mechanism, nor a good way to identify when the token
> would expire.

This is outside the scope of Kerberos.

Assuming you are using YARN, making the RM handle S3 temp credentials,
similar to HDFS delegation tokens, is something to consider.

> There are companies that work on bridging security mechanisms of different
> types, but this is not a primary goal for Hadoop.  Hadoop can benefit from
> modernized security using open standards like OpenID Connect, which
> proposes to unify web applications using SSO.  This ensures the client
> credentials are transported in each stage of client-server interaction.
> This may improve overall security, and provide a more cloud-native form
> factor.  I wonder if there is any interest in the community to enable
> Hadoop OpenID Connect integration work?

End-to-end identity assertion is something Kerberos by itself does not address.
But any implementation should not pass "credentials". We need a way to pass
signed requests that can be verified along the chain.

>
> regards,
> Eric


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [EXTERNAL] Re: [DISCUSS] Secure Hadoop without Kerberos

2020-05-20 Thread Craig . Condit
I have to strongly disagree with making UGI.doAs() private. Just because you
feel that impersonation isn't an important feature does not make it so for all
users. There are many valid use cases which require impersonation, and in fact
I consider this to be one of the differentiating features of the Hadoop
ecosystem. We make use of it heavily to build a variety of services which would
not be possible without it. Also consider that in addition to gateway
services such as Knox being broken by this change, you would also cripple job
schedulers such as Oozie. Running workloads on YARN as different users is vital
to ensure that queue resources are allocated and accounted for properly and
that file permissions are enforced. Without impersonation, all users of a
cluster would need to be granted access to talk directly to YARN. Higher-level
access points or APIs would not be possible.

Craig Condit


From: Eric Yang 
Sent: Wednesday, May 20, 2020 1:57 PM
To: Akira Ajisaka 
Cc: Hadoop Common 
Subject: [EXTERNAL] Re: [DISCUSS] Secure Hadoop without Kerberos

Hi Akira,

Thank you for the information.  Knox plays a main role as a reverse proxy for
the Hadoop cluster.  I understand the importance of keeping Knox running to
centralize the audit log for ingress into the cluster.  Other reverse proxy
solutions like Nginx are more feature rich for caching static content and load
balancing.  It would be great to have the ability to use either Knox or Nginx
as the reverse proxy solution.  A company-wide OIDC provider is likely to run
independently from the Hadoop cluster, but it could also run inside one.  The
reverse proxy must be able to redirect to the OIDC provider at an appropriately
exposed endpoint.

HADOOP-11717 was a good effort to enable SSO integration, except that it is
written to extend Kerberos authentication, which keeps decoupling from
Kerberos from becoming a reality.  I gathered a few design requirements this
morning, and contributions are welcome:

1.  Encryption is mandatory.  Server certificate validation is required.
2.  Existing token infrastructure for the block access token remains the same.
3.  Replace the delegation token transport with an OIDC JWT token.
4.  Patch the token renewer logic to renew tokens with the OIDC endpoint
before they expire.
5.  Impersonation logic uses service user credentials; a new way to renew
service user credentials securely is needed.
6.  Replace the Hadoop RPC SASL transport with TLS, because OIDC works with
TLS natively.
7.  Command CLI improvements to use environment variables or files for
accessing client credentials.

Downgrade the use of UGI.doAs() to private to Hadoop.  Services should not
run with elevated privileges unless there is a good reason for it (e.g.
loading Hive external tables).
I think this is a good starting point, and feedback can help to turn these
requirements into tasks.  Let me know what you think.  Thanks

regards,
Eric

On Tue, May 19, 2020 at 9:47 PM Akira Ajisaka  wrote:

> Hi Eric, thank you for starting the discussion.
>
> I'm interested in OpenID Connect (OIDC) integration.
>
> In addition to the benefits (security, cloud native), operating costs may
> be reduced in some companies.
> We have our company-wide OIDC provider and enable SSO for Hadoop Web UIs
> via Knox + OIDC in Yahoo! JAPAN.
> On the other hand, Hadoop administrators have to manage our own KDC
> servers only for Hadoop ecosystems.
> If Hadoop and its ecosystem can support OIDC, we don't have to manage KDC
> and that way operating costs will be reduced.
>
> Regards,
> Akira
>
> On Thu, May 7, 2020 at 7:32 AM Eric Yang  wrote:
>
>> Hi all,
>>
>> Kerberos was developed a decade before web development became popular.
>> There are some Kerberos limitations which do not work well in Hadoop.  A
>> few examples of corner cases:
>>
>> 1. The Kerberos principal doesn't encode a port number, so it is difficult to know
>> if the principal is coming from an authorized daemon or a hacker container
>> trying to forge a service principal.
>> 2. Hadoop Kerberos principals are used as high-privileged principals, a form
>> of credential to impersonate the end user.
>> 3. Delegation tokens may allow expired users to continue to run jobs long
>> after they are gone, without rechecking if end user credentials are still
>> valid.
>> 4. Passing different forms of tokens does not work well with cloud provider
>> security mechanisms.  For example, passing an AWS STS token for an S3 bucket.
>> There is no renewal mechanism, nor a good way to identify when the token
>> would expire.
>>
>> There are companies that work on bridging security mechanisms of different
>> types, but this is not a primary goal for Hadoop.  Hadoop can benefit from
>> modernized security using open standards like OpenID Connect, which
>> proposes to unify web applications using SSO.  This ensures the client
>> credentials are transported in each stage of client-server interaction.
>> This may improve overall security, and provide a more cloud-native form
>> factor.  I wonder if there is 

Re: [DISCUSS] Secure Hadoop without Kerberos

2020-05-20 Thread Eric Yang
Hi Akira,

Thank you for the information.  Knox plays a main role as a reverse proxy for
the Hadoop cluster.  I understand the importance of keeping Knox running to
centralize the audit log for ingress into the cluster.  Other reverse proxy
solutions like Nginx are more feature rich for caching static content and load
balancing.  It would be great to have the ability to use either Knox or Nginx
as the reverse proxy solution.  A company-wide OIDC provider is likely to run
independently from the Hadoop cluster, but it could also run inside one.  The
reverse proxy must be able to redirect to the OIDC provider at an appropriately
exposed endpoint.

HADOOP-11717 was a good effort to enable SSO integration, except that it is
written to extend Kerberos authentication, which keeps decoupling from
Kerberos from becoming a reality.  I gathered a few design requirements this
morning, and contributions are welcome:

1.  Encryption is mandatory.  Server certificate validation is required.
2.  Existing token infrastructure for the block access token remains the same.
3.  Replace the delegation token transport with an OIDC JWT token.
4.  Patch the token renewer logic to renew tokens with the OIDC endpoint
before they expire (a minimal renewal sketch follows this list).
5.  Impersonation logic uses service user credentials; a new way to renew
service user credentials securely is needed.
6.  Replace the Hadoop RPC SASL transport with TLS, because OIDC works with
TLS natively.
7.  Command CLI improvements to use environment variables or files for
accessing client credentials.
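A minimal sketch of what requirement 4 could look like, assuming a standard
OAuth2 refresh_token grant against the provider's token endpoint (the endpoint
URL, client id and refresh token below are placeholders, not a worked-out
Hadoop API):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class OidcRenewSketch {
      public static void main(String[] args) throws Exception {
        // Placeholders -- substitute the provider's discovery values.
        URL tokenEndpoint = new URL("https://idp.example.com/oauth2/token");
        String body = "grant_type=refresh_token"
            + "&client_id=hadoop-client"
            + "&refresh_token=" + args[0];

        HttpURLConnection conn = (HttpURLConnection) tokenEndpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setDoOutput(true);
        conn.getOutputStream().write(body.getBytes(StandardCharsets.UTF_8));

        // The JSON response carries a fresh token plus an expires_in value
        // the renewer can use to schedule itself ahead of expiry.
        try (InputStream in = conn.getInputStream()) {
          System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
      }
    }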

Downgrade the use of UGI.doAs() to private to Hadoop.  Services should not
run with elevated privileges unless there is a good reason for it (e.g.
loading Hive external tables).
I think this is a good starting point, and feedback can help to turn these
requirements into tasks.  Let me know what you think.  Thanks

regards,
Eric

On Tue, May 19, 2020 at 9:47 PM Akira Ajisaka  wrote:

> Hi Eric, thank you for starting the discussion.
>
> I'm interested in OpenID Connect (OIDC) integration.
>
> In addition to the benefits (security, cloud native), operating costs may
> be reduced in some companies.
> We have our company-wide OIDC provider and enable SSO for Hadoop Web UIs
> via Knox + OIDC in Yahoo! JAPAN.
> On the other hand, Hadoop administrators have to manage our own KDC
> servers only for Hadoop ecosystems.
> If Hadoop and its ecosystem can support OIDC, we don't have to manage KDC
> and that way operating costs will be reduced.
>
> Regards,
> Akira
>
> On Thu, May 7, 2020 at 7:32 AM Eric Yang  wrote:
>
>> Hi all,
>>
>> Kerberos was developed a decade before web development became popular.
>> There are some Kerberos limitations which do not work well in Hadoop.  A
>> few examples of corner cases:
>>
>> 1. The Kerberos principal doesn't encode a port number, so it is difficult to know
>> if the principal is coming from an authorized daemon or a hacker container
>> trying to forge a service principal.
>> 2. Hadoop Kerberos principals are used as high-privileged principals, a form
>> of credential to impersonate the end user.
>> 3. Delegation tokens may allow expired users to continue to run jobs long
>> after they are gone, without rechecking if end user credentials are still
>> valid.
>> 4. Passing different forms of tokens does not work well with cloud provider
>> security mechanisms.  For example, passing an AWS STS token for an S3 bucket.
>> There is no renewal mechanism, nor a good way to identify when the token
>> would expire.
>>
>> There are companies that work on bridging security mechanisms of different
>> types, but this is not a primary goal for Hadoop.  Hadoop can benefit from
>> modernized security using open standards like OpenID Connect, which
>> proposes to unify web applications using SSO.  This ensures the client
>> credentials are transported in each stage of client-server interaction.
>> This may improve overall security, and provide a more cloud-native form
>> factor.  I wonder if there is any interest in the community to enable
>> Hadoop OpenID Connect integration work?
>>
>> regards,
>> Eric
>>
>


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2020-05-20 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/147/

[May 19, 2020 3:45:54 AM] (noreply) HADOOP-17004. ABFS: Improve the ABFS driver 
documentation
[May 19, 2020 5:27:12 AM] (noreply) HADOOP-17024. ListStatus on ViewFS root (ls 
"/") should list the linkFallBack root (configured target root). Contributed by 
Abhishek Das.
[May 19, 2020 5:36:36 AM] (Surendra Singh Lilhore) MAPREDUCE-6826. Job fails 
with InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at 
SUCCEEDED/COMMITTING. Contributed by Bilwa S T.
[May 19, 2020 7:30:07 PM] (noreply) Hadoop-17015. ABFS: Handling Rename and 
Delete idempotency
[May 19, 2020 11:47:04 PM] (noreply) HADOOP-16586. ITestS3GuardFsck, others 
fails when run using a local metastore. (#1950)


[Error replacing 'FILE' - Workspace is not accessible]

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

[jira] [Resolved] (HADOOP-16900) Very large files can be truncated when written through S3AFileSystem

2020-05-20 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-16900.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

in trunk; rebuilding and retesting branch-3.3 with it too

> Very large files can be truncated when written through S3AFileSystem
> 
>
> Key: HADOOP-16900
> URL: https://issues.apache.org/jira/browse/HADOOP-16900
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.2.1
>Reporter: Andrew Olson
>Assignee: Mukund Thakur
>Priority: Major
>  Labels: s3
> Fix For: 3.4.0
>
>
> If a written file size exceeds 10,000 * {{fs.s3a.multipart.size}}, a corrupt 
> truncation of the S3 object will occur, as the maximum number of parts in a 
> multipart upload is 10,000 as specified by the S3 API, and there is an apparent 
> bug where this failure is not fatal and the multipart upload is allowed to 
> be marked as completed.
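For a sense of scale (the 64 MB part size below is only an example value, not
necessarily the reporter's setting), the truncation threshold works out as:

    public class S3aTruncationThreshold {
      public static void main(String[] args) {
        long partSize = 64L * 1024 * 1024;    // example fs.s3a.multipart.size
        long maxParts = 10_000;               // S3 multipart part-count limit
        // Bytes written beyond this point were silently dropped: 625 GiB here.
        System.out.println("threshold = " + partSize * maxParts + " bytes");
      }
    }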



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-17050) Add support for multiple delegation tokens in S3AFilesystem

2020-05-20 Thread Gabor Bota (Jira)
Gabor Bota created HADOOP-17050:
---

 Summary: Add support for multiple delegation tokens in 
S3AFilesystem
 Key: HADOOP-17050
 URL: https://issues.apache.org/jira/browse/HADOOP-17050
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Gabor Bota
Assignee: Gabor Bota


In {{org.apache.hadoop.fs.s3a.auth.delegation.AbstractDelegationTokenBinding}}, 
{{createDelegationToken}} should return a list of tokens.
With this functionality, the {{AbstractDelegationTokenBinding}} can get two 
different tokens at the same time.
{{AbstractDelegationTokenBinding.TokenSecretManager}} should be extended to 
retrieve secrets and look up delegation tokens (use the public API for the 
secret manager in Hadoop).
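A rough sketch of the proposed shape (the interface and method names below are
illustrative only, not the existing S3A binding API):

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.security.token.Token;
    import org.apache.hadoop.security.token.TokenIdentifier;

    /**
     * Illustrative contract only: a binding that can issue more than one
     * delegation token per filesystem instance, as proposed in this issue.
     */
    public interface MultiTokenDelegationBinding {
      /** Issue every token this binding needs (for example an S3 session
       *  token plus an encryption-secrets token) instead of a single one. */
      List<Token<? extends TokenIdentifier>> createDelegationTokens(String renewer)
          throws IOException;
    }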




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org