Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20447 )

Change subject: IMPALA-13039: AES Encryption/ Decryption Support in Impala
......................................................................


Patch Set 17:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1810
PS13, Line 1810: _thresho
> Done
You haven't addressed this part of Noemi's comment:

 > "other modes" or "CTR and CFB mode."
 >
 > You describe CTR and CFB later. It would be nice to list here what
 > parameters are required for each mode.
 > Also, encryption in ECB mode is not supported, so it is a bit
 > confusing to list here what parameters are necessary for ECB
 > encryption.
 > I agree that it is useful to include a detailed description about
 > the modes. I would put this long comment above the aes_decrypt
 > function since that is supported for all modes. That way the
 > comment aligns better with the code.
 > And just a suggestion: you can switch the order of encrypt and
 > decrypt functions like this:
 >
 > //long comment about all supported modes.
 > aes_decrypt()
 >
 > //short comment, ECB is not supported.
 > aes_encrypt()

AES-ECB is still listed here, above the aes_encrypt() function.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1866
PS13, Line 1866: inters in th
> Done, we fallback to other modes in case mode is null, added logs for it as
Also include the mode string (which turned out to be invalid) in the log.


http://gerrit.cloudera.org:8080/#/c/20447/15/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/20447/15/be/src/exprs/string-functions-ir.cc@2134
PS15, Line 2134:   result.len = out_len;
Now that AES_BLOCK_SIZE is only added to the buffer length for ECB, is this 
step still necessary?


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@1936
PS17, Line 1936: and optional AAD
Where can AAD be added?


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@1991
PS17, Line 1991: encryption_key.
> style nit (if applicable to Impala's code): consider an explicit notation o
Yes, for static methods we should use the explicit static syntax.


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@2058
PS17, Line 2058: and optional AAD
Where can AAD be provided?


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@2059
PS17, Line 2059: s
expect


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@2087
PS17, Line 2087: en
Decryption would be better.


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@2124
PS17, Line 2124: encryption_key.IsEcbMode() ?
The conditional is only needed for the 'len' parameter, so this could be

StringVal result = StringVal(ctx, encryption_key.IsEcbMode() ?len + 
AES_BLOCK_SIZE : len);


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/exprs/string-functions-ir.cc@2134
PS17, Line 2134:   result.len = out_len;
Now that we've confirmed that the additional AES_BLOCK_SIZE is only needed for 
ECB, I think this can be replaced with
DCHECK_EQ(result.len, out_len).


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.h
File be/src/util/openssl-util.h:

http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.h@71
PS17, Line 71: ,
Use "." instead of ",".


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.h@71
PS17, Line 71: e
Capital "E": "Enum".


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.h@139
PS17, Line 139: /// plaintext and the AAD. The AAD itself is not required or 
won't change the security.
Now, compared to the original state, we're missing "and GCM also protects the 
confidentiality of the plaintext". No need to remove that.


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.h@159
PS17, Line 159:   /// 'gcm_tag_' variable).
'out_len' should be mentioned in the comment:
  "'out_len' (if not NULL) will be set to the output length."
See also for Decrypt() and EncryptInternal().


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.h@195
PS17, Line 195: expr
Why do you call it 'expr'? 'tag' would be better. Don't forget to rename it in 
the .cc file as well.


http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@223
PS13, Line 223:   int padding_flag = padding_enabled ? 1 : 0;
> Please refer to Michael's comment on L233 on PS10.
Ok.


http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/17/be/src/util/openssl-util.cc@227
PS17, Line 227: numerous modes
What other modes? In the original comment only CTR is mentioned. Also, what is 
the reason for adding "particularly" on L229? It feels like this paragraph was 
rewritten by an AI, but that's not good:
 - it introduces a "literary" style that is unnecessary here
 - it adds imprecise details like the ones I pointed out, altering the original 
meaning of the text.

This is a technical text, it should be precise and concise, in contrast to a 
literary work.

Please go back to the original text and add changes where it is appropriate, 
i.e. where the behaviour of the code has also changed.


http://gerrit.cloudera.org:8080/#/c/20447/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/20447/13/common/function-registry/impala_functions.py@520
PS13, Line 520: [[
> Did you miss addressing this nit or the indent is supposed to be like this?
The extra spaces should be removed.


http://gerrit.cloudera.org:8080/#/c/20447/17/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/20447/17/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3456
PS17, Line 3456: select aes_decrypt(aes_encrypt('ABC', 
'12345678901234567890123456789012','AES_256_GCM','1234567890123456'),
Shouldn't this be under the "mode not specified" queries after L3585?.


http://gerrit.cloudera.org:8080/#/c/20447/17/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3586
PS17, Line 3586: ''
Shouldn't the default mode be used when the key is NULL, as opposed to being an 
empty string? Where is the empty string converted to NULL?


http://gerrit.cloudera.org:8080/#/c/20447/17/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3601
PS17, Line 3601: # Encryption/ decryption with expr as NULL.
Many test queries do not explicitly specify the AES mode. Except for the tests 
specifically checking the fallback, the AES mode should be provided explicitly.



--
To view, visit http://gerrit.cloudera.org:8080/20447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3902f2b1d95da4d06995cbd687e79c48e16190c9
Gerrit-Change-Number: 20447
Gerrit-PatchSet: 17
Gerrit-Owner: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Comment-Date: Mon, 26 Aug 2024 13:12:52 +0000
Gerrit-HasComments: Yes

Reply via email to