[ 
https://issues.apache.org/jira/browse/HDDS-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776521#comment-16776521
 ] 

Anu Engineer commented on HDDS-1043:
------------------------------------

[~ajayydv] Thanks for working on this patch. I have truly reviewed only 2 
files. Primarily the AWSV4AuthParser.java and the parent interface. I have a 
bunch of comments, so I am leaving them here, and I can add my comments on 
other parts of the code later.

1. I am going to mix some style advice with the code review.

Here is a section in the code:

{code}
 auth = StringUtils.split(headerMap.getFirst(AUTHORIZATION_HEADER), ',');
{code}

The reader of the code finds it hard to parse this, or even judge if what you 
are doing is correct since the format is not known.

This line is followed by which looks suspicious since you log an error message, 
but continue.
The reader of the code is left wondering if this is a coding error, by design 
etc. 

 {code}
 if (auth.length < 3) {
   LOG.error("Authorization header in unexpected format. Auth header:{}",
       headerMap.get(AUTHORIZATION_HEADER));
 }

 authParser = new AuthorizationHeaderV4(
     headerMap.getFirst(AUTHORIZATION_HEADER));

{code}

So the next thing the reader has to do is to look the link that is the top of 
the source code comments.
Unfortunately, that page does not link has no information about this header.

I would like to propose that we rewrite this expression as follows:


{code}
    // According to AWS sigv4 documentation, authorization header should be
    // in following format.
    // Authorization: algorithm Credential=access key ID/credential scope,
    // SignedHeaders=SignedHeaders, Signature=signature

    // StringToSign =
    //    Algorithm + \n +
    //    RequestDateTime + \n +
    //    CredentialScope + \n +
    //    HashedCanonicalRequest

parseAuthorizationHeader(Map headerMap) {
        Precondition.notNull(headerMap);
        String auth = headerMap.getFirst(AUTHORIZATION_HEADER);
        // --> Error check here needed. 
        // This check is missing in the current code --
        if(StringUtils.isNullorEmpty(auth)) {
                return Error;
        }

        String [] authFields = StringUtils.split(auth, ','');
        if (auth.authFields < AUTH_FIELD_COUNT) {
      LOG.error("Authorization header in unexpected format. Auth header:{}",
          headerMap.get(AUTHORIZATION_HEADER));
     // --> Error, Return needed here. We cannot continue.
    }

    // Feel free to return an AuthorizationHeaderV4 here,
    // but let us not parse this string again. We have broken this up already 
into
    // it parts. Parsing a plain string is slow, dangerous and error-prone.
    // Since we have already done that, let us go and use the parsed results to 
    // build an AuthorizationHeaderV4 header here.

}
{code}


2. I think the stingToSign function is better renamed as parse.


3. AWSV4AuthParser#getStringToSign --
        3.1 First Field in the V4 signature can have one and only one value 
according to Amazon.
                        {code}
                        algorithm = auth[0].substring(0, auth[0].indexOf(" "));
                        {code}
        Hence I propose a simple trim function followed by a comparison to the 
only supported Algorithm
        Something like
        Here is the reference that I am using for this Signature.
        
https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html
        {code}
         algorithm = auth[0].trim();
         if (!algorith.equals(AWS_V4_ALGO)) {
                // --->Error. There is no point in going forward since the 
Algorithm does
                // not match the only know Algorithm function.

                // when we support other signature versions, this field should 
be split on '-'
                // so we can extract the signature version independently.
         }

Unfortunately, it is not that simple, since the next line, 
Credential=AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request
is only separated by space. So what we really need to do is to tokenize the 
auth[0], and then trim the first value.
So you would end up rewriting this part as
        {code}
                String [] creds = auth[0].split("\\s");
                algorithm = creds[0];
                if(!algorithm.equals ...)
        {code}

4. I am going to presume that the comment above is not correct, Line 81 in 
AWSV4AuthParser.
Since the next line should be a key=value pair which should look like this.

>From  
>https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html
{code}
Credential=AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request
{code}

Since we just did the creds split in the above line, let us write a function 
that handles the parsing of 
credentials

some thing like :
{code}
parseCredentials(String credential) {
// A credential is a key value pair that looks like this.
// Credential=AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request

if(StringUtils.isNullOrEmpty(credential)) {
        // ---> Error : Let us throw another error here.
}
String [] credKeyValue = credential.split("=");
if(!credKeyValue[0].equals("Credential")) { 
// Please use a symbolic constant for same code style consistency 
// ---> Let throw another error. This value should have been there in the 
header.
}

if(StingUtils.isNullorEmpty(credKeyValue[1])) {
        // if this is not or empty
        // ---> Let us throw another error.
        // Please remember, you are parsing an HTTP header, which is untrusted,
        // therefore; it pays rich divindents to be delibrate in each step.
}

{code}
_
// So at this point we know we have a string that is the payload of the 
Credential field.
// Let us write another function to parse the that field.
// Here is what Amazon says it looks like 
// <your-access-key-id>/<date>/<aws-region>/<aws-service>/aws4_request
// Where:
//    <date> value is specified using YYYYMMDD format.
//  <aws-service> value is s3 when sending request to Amazon S3._

{code}
parseCredPayload(String creds) {
        // We are going to assume that all the fields are required.
        //  That means that a split of this string should yield 
        // 5 fields.

        String [] credFields = creds.split('/');
        if (credfields.length != CRED_FIELDS_LENGTH) {
                // Ok, we have another failure here.
                // --> return error.
        }

        // Now let us verify each of the Format of the fields
        // Amazon access key ID is a well-known format, 
        // however, ozone does not follow that format. So
        //  we need to skip over the first field in silence.

        if(StringUtils.isNullOrEmpty(credFields[0]) {
        // This is not correct 
        // -->Error,let us return an error here.
        }

        // Now let us verify the time stamp
        // Verify the YYYYMMDD is the current day 
        // all time is in UTC , so the issue is if the 
        // client is little behind or ahead and is a day
        // behind or ahead.

        // So we will have to verify that the date is either 
        //yesterday, today or tomorrow.
        // All other dates should be rejected and an error should be returned.

        // Aws Region -- we will not have a consistent value for this.
        // we can ask our clients to configure any region and send the request 
to 
        // us, with the only constraint that this is non-null or some valid AWS 
region.

        // AWS service -- has to "S3". Everything else should fail.



}

You get the idea, we need to make sure that our parser truly parses *untrusted* 
data from the internet. *That is the uber point.*

Even if you are looking at this place --  
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-string-to-sign.html
the signature should look very different. Btw, that is not where you have to 
look for parsing the incoming request from amazon client.



> Enable token based authentication for S3 api
> --------------------------------------------
>
>                 Key: HDDS-1043
>                 URL: https://issues.apache.org/jira/browse/HDDS-1043
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Ajay Kumar
>            Assignee: Ajay Kumar
>            Priority: Major
>              Labels: security
>             Fix For: 0.4.0
>
>         Attachments: HDDS-1043.00.patch, HDDS-1043.01.patch, 
> HDDS-1043.02.patch, HDDS-1043.03.patch
>
>
> Ozone has a  S3 api and mechanism to create S3 like secrets for user. This 
> jira proposes hadoop compatible token based authentication for S3 api which 
> utilizes S3 secret stored in OM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to