Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20447 )

Change subject: IMPALA-13039: AES Encryption/ Decryption Support in Impala
......................................................................


Patch Set 13:

(44 comments)

http://gerrit.cloudera.org:8080/#/c/20447/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20447/13//COMMIT_MSG@31
PS13, Line 31: Multiple AES modes (GCM, CTR, CFB, ECB) are supported
This could be misleading as ECB is only supported for reading.


http://gerrit.cloudera.org:8080/#/c/20447/13//COMMIT_MSG@32
PS13, Line 32: flexibility and compatibility with various use cases and OpenSSL 
features.
Line too long.


http://gerrit.cloudera.org:8080/#/c/20447/13//COMMIT_MSG@33
PS13, Line 33: AES-GCM is set as the default mode due to its strong security 
properties.
Line too long.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@23
PS9, Line 23: #include <openssl/evp.h>
            : #include <stdint.h>
> Sorry, these should be in their own include group because they are not std
Please address this comment.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@26
PS9, Line 26: #include <re2/re2.h>
> What is the string header used for?
Please address this comment.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1810
PS9, Line 1810:  key, AES
> Why do you call it "expression"? I think "input" was better.
Please address this comment. Don't forget to change the parameter name in the 
.h file as well, and also for aes_decrypt().


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1846
PS9, Line 1846:   }
> key is getting checked at 1851.
No, the NULL-ness of 'key' is not checked at L1851 (or L1849 in PS13). Also, as 
I wrote in the comment, the key being NULL should probably be an error.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1858
PS13, Line 1858:   if (mode.is_null)
Add { and } around the block.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1866
PS13, Line 1866: Invalid Mode
"Invalid AES mode" would be more informative. Also, including 'mode' would also 
be good.
Can this be invalid if 'mode' is NULL? If so, the log should make it clear if 
that is the case, i.e. whether the invalid mode comes from a NULL (lack of a 
user-provided value) or from a user-provided value.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1893
PS13, Line 1893: spaces
Nit: "space".


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1903
PS13, Line 1903: (uint8_t*)
> style nit: if using reinterpet_cast elsewhere, why not using it here?
No need to use a cast at all, but if you do, use reinterpret_cast instead, not 
C-style casts.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1912
PS13, Line 1912: (uint8_t*)
> ditto for the reinterpret_cast
No need for a cast.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1930
PS13, Line 1930:   if (expr.is_null || key.is_null) {
The key being NULL should be an error.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1947
PS13, Line 1947:   else {
No newline necessary, "else" should come after the closing "}".


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1978
PS13, Line 1978: (uint8_t*)
No need for the cast.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1982
PS13, Line 1982:   uint8_t out[len + AES_BLOCK_SIZE];
See comment about var-len arrays on the stack at PS9 L1886.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1982
PS13, Line 1982: AES_BLOCK_SIZE
When do we need to add this additional block size? Is it needed for all modes 
or only ECB?


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h
File be/src/util/openssl-util.h:

http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@153
PS13, Line 153: 'out
Nit: the 'out' buffer.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@153
PS13, Line 153: 'data'
Nit: the 'data' buffer.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@153
PS13, Line 153: is
"Should be" or "needs to be". Also, it can actually be bigger, so it needs to 
be "at least as big".


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@157
PS13, Line 157: data_read
Add single quotes: 'data_read'.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@157
PS13, Line 157: refers to the output length
This would be better: "'data_read' (if not NULL) will be set to the output 
length". On the other hand, 'out_len' would be better, as I suggested in my 
previous comment. Remember to rename it in the .cc file as well.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@162
PS13, Line 162: ///
> nit: extra '///'
See comments about the 'out' buffer above the Encrypt() function.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@168
PS13, Line 168: data_read
See L157.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@181
PS13, Line 181:
> nit: the indent seems to be off -- if that's a continuation of the line abo
Yes, continuation lines should be indented +4 relative to the line they 
continue.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@190
PS13, Line 190: It initializes key and iv
> nit: Initializes 'key' and 'iv'
Should actually be 'key_' and 'iv_', with underscores.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@193
PS13, Line 193: get_gcm_tag
Follow the naming convention: GetGcmTag().


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@194
PS13, Line 194: set_gcm_tag
Follow the naming convention: SetGcmTag().


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@211
PS13, Line 211: contains
See comment about output buffer length above the Encrypt() function.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@216
PS13, Line 216: data_read refers to the output length.
> Why not 'out_len' then?  What is the unit of length?  Bytes?  Would be grea
See L157.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h
File be/src/util/openssl-util.h:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@138
PS9, Line 138:
> Done
Not done. Also, why do you reword this comment? There is no need for it.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@145
PS9, Line 145:   /// tag for GCM mode. Reinitializes with new random values if 
the key was already
> I don't see the need for rewriting this paragraph.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@224
PS9, Line 224:   /// uninitialized keys.
> This should be right after L214.
Not done, 'key_length_' should come right after 'key_'.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@77
PS9, Line 77: #if OPENSSL_VERSION_NUMBER < 0x10100000L
> It's just to eliminate we and us
No need to do that, changing this is unnecessary.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@345
PS9, Line 345: }
> No need to rewrite this paragraph.
Please revert it back to the original wording, no need to reword it.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@361
PS9, Line 361:     // class will gracefully fall back to CTR mode. While this 
fallback approach is
> No need to rewrite this paragraph.
No need to rewrite this paragraph.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@223
PS13, Line 223:   int padding_flag = padding_enabled ? 1 : 0;
No need for this variable, bools are implicitly converted to 1 or 0.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@228
PS13, Line 228:
Nit: add full stop (".").


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@227
PS13, Line 227: all of which facilitate arbitrary length
              :   // ciphertexts
This is also not true because ECB necessitates a multiple of 16 bytes.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@271
PS13, Line 271:       *data_read += out_len;
> style nit: wrap this into {}, similar to the 'if' clause above?
If the whole IF statement can fit on one line, we don't put it into {}, but 
write the body on the same line. If there is an ELSE or ELSE IF, we use {} for 
all branches.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@297
PS13, Line 297: *data_read += final_out_len;
> style nit: wrap this into {}, similar to the 'if' clause above?
See L271.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@318
PS13, Line 318: AES_256_ECB
This line should come after AES_256_CFB, so that 256 bit and 128 bit modes are 
grouped together.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@415
PS13, Line 415:   }
              :   else if
> style nit: this doesn't seem to match with the rest of the style in the fil
Yes, the "else if" should come on the same line as the closing "}".


http://gerrit.cloudera.org:8080/#/c/20447/13/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/20447/13/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3315
PS13, Line 3315: +BIqtrEK9FpC/zrvpOycjQ==
Do we know why we have additional data compared to previous patches (like PS9 
or PS10)? Did something work incorrectly that has been fixed?



-- 
To view, visit http://gerrit.cloudera.org:8080/20447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3902f2b1d95da4d06995cbd687e79c48e16190c9
Gerrit-Change-Number: 20447
Gerrit-PatchSet: 13
Gerrit-Owner: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Comment-Date: Wed, 07 Aug 2024 15:48:11 +0000
Gerrit-HasComments: Yes

Reply via email to