Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20447 )

Change subject: IMPALA-13039: AES Encryption/ Decryption Support in Impala
......................................................................


Patch Set 24:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/20447/22/be/src/util/openssl-util-test.cc
File be/src/util/openssl-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/20447/22/be/src/util/openssl-util-test.cc@67
PS22, Line 67:     // Check all the modes
> Have added for 128 bit mode. For ECB we might have to add a new function. I
These are unit tests for the class EncryptionKey. We've added new functionality 
to this class, among others the ECB mode. It is not good practice to not extend 
the existing tests for new functionality in the same patch.

Also, the end to end tests don't test ECB encryption, only decryption, because 
we only want to support that for the Impala builtin functions. This class, 
however, can also encrypt with ECB, which I think is ok, but should be tested.
Another example: aes_encrypt() and aes_decrypt() never do encryption in-place 
(where the output buffer is the same as the input buffer) - this test does just 
that. Another example is encrypting buffers whose length is longer than what 
can be stored in a 32 bit integer.

Actually, when extending this test to include ECB, I caught a bug in the code 
around openssl-util.cc:L275, see there.

Extending this test to include ECB is not that difficult, we have to take care 
of the possibly different ciphertext size, but only a few lines need to be 
changed:

```
void TestEncryptionDecryption(const int64_t buffer_size) {
  vector<uint8_t> original(buffer_size);                                   
  // Scratch buffer for in-place encryption.            
  vector<uint8_t> scratch(buffer_size + AES_BLOCK_SIZE);
  if (buffer_size % 8 == 0) {
    GenerateRandomData(original.data(), buffer_size);                     
  } else {                                               
    GenerateRandomBytes(original.data(), buffer_size);
  }                  
                                                                               
  // Check all the modes                                      
  AES_CIPHER_MODE modes[] = {              
      AES_CIPHER_MODE::AES_256_GCM,
      AES_CIPHER_MODE::AES_256_CTR,                                             
    
      AES_CIPHER_MODE::AES_256_CFB,                                
      AES_CIPHER_MODE::AES_256_ECB,               
      AES_CIPHER_MODE::AES_128_GCM,
      AES_CIPHER_MODE::AES_128_ECB,                                             
           
  };                                                                      
  for (auto m : modes) {                                 
    memcpy(scratch.data(), original.data(), buffer_size);
                   
    EncryptionKey key;                                                       
    if (m == AES_CIPHER_MODE::AES_256_GCM || m == AES_CIPHER_MODE::AES_256_CTR 
||
        m == AES_CIPHER_MODE::AES_256_CFB || m == AES_CIPHER_MODE::AES_256_ECB) 
{
      key.InitializeRandom(32, AES_BLOCK_SIZE);                                 
  
    } else {                                                     
      key.InitializeRandom(16, AES_BLOCK_SIZE);
    }                        
    key.SetCipherMode(m);                                                       
     
                                                                    
    int64_t encrypted_length;                      
    ASSERT_OK(key.Encrypt(scratch.data(), buffer_size, scratch.data(), 
&encrypted_length));
                                                                                
      
    // Check that encryption did something                           
    ASSERT_NE(0, memcmp(original.data(), scratch.data(), buffer_size));
    ASSERT_OK(key.Decrypt(scratch.data(), encrypted_length, scratch.data()));
    // Check that we get the original data back.                               
    ASSERT_EQ(0, memcmp(original.data(), scratch.data(), buffer_size));
  }                                      
}                                                                               
                                                                                
              
```


http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util-test.cc
File be/src/util/openssl-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util-test.cc@67
PS24, Line 67:     // Check all the modes
> nit: this isn't really all the modes, since it doesn't include ECB. Can we
I think we should also test ECB encryption if the EncryptionKey class supports 
it. Its user, the aes_encrypt() function doesn't use it, but the class 
interface includes this feature so somebody may start using it later.

The alternative is to disable ECB encryption also in this class - that is also 
ok for me, but I don't think that's necessary. Extending the existing unit 
tests may be easier if we keep encryption.


http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util-test.cc@79
PS24, Line 79:           m == AES_CIPHER_MODE::AES_256_CFB || m == 
AES_CIPHER_MODE::AES_256_ECB) {
> ECB is never used.
See my reply to L67.


http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util.h
File be/src/util/openssl-util.h:

http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util.h@176
PS24, Line 176: In other modes,
Nit: "In other modes" is no longer needed.


http://gerrit.cloudera.org:8080/#/c/20447/22/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/22/be/src/util/openssl-util.cc@197
PS22, Line 197: void EncryptionKey::InitializeRandom(int key_len, int iv_len) {
> Done
Have you considered this?
```
On the other hand, starting from this change, key length is dependent on the 
operation mode, so maybe setting the operation mode could also be part of 
initialization, instead of a separate SetCipherMode() method.
```
SetCipherMode() is new in this change, so no existing uses would be affected. 
If we included the op mode in initialisation, that would be one less 
opportunity to use this class inconsistently. Then the 'key_len' wouldn't have 
to be provided as a parameter, it could be calculated from the operation mode.

If they are kept separately, like now, it's possible to configure it with 
key_len=16 and then to try to use it with GCM_256 for example.


http://gerrit.cloudera.org:8080/#/c/20447/22/be/src/util/openssl-util.cc@231
PS22, Line 231:
> So should I include ECB in the comment for encryption? Wouldn't that lead t
Yes, ECB should be included. It won't cause confusion because this class 
supports also encryption with ECB. It is a different matter that aes_encrypt() 
doesn't allow that, but this class does.


http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util.cc@266
PS24, Line 266:   // The OpenSSL encryption APIs use INT for buffer lengths. To 
support larger buffers,
I don't think this works well with ECB. If we have for example two chunks, an 
extra padding block will be inserted after the first chunk. Because we're also 
chunking the input during decryption, we'll be able to decrypt data that we 
encrypted, but this is not standard and may not be compatible with other tools.
For this patch I think it's enough if we disable ECB encryption and decryption 
for inputs larger than the max value of an 'int', we don't need to support that 
now. To be precise, the output length is also stored in an int, so the output 
should not be allowed to be longer, and the output can be AES_BLOCK_SIZE longer 
than the input in case of encryption (but not decryption). So the input should 
be at most "numeric_limits<int>::max() - AES_BLOCK_SIZE" for encryption and at 
most "numeric_limits<int>::max()" for decryption.

The length could be checked at the beginning of this function. Note that it 
only affects ECB, so other encryption modes should be ok.


http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util.cc@275
PS24, Line 275: input
Should be 'output_offset'.


http://gerrit.cloudera.org:8080/#/c/20447/24/be/src/util/openssl-util.cc@277
PS24, Line 277: input
Should be 'output_offset'.



--
To view, visit http://gerrit.cloudera.org:8080/20447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3902f2b1d95da4d06995cbd687e79c48e16190c9
Gerrit-Change-Number: 20447
Gerrit-PatchSet: 24
Gerrit-Owner: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Comment-Date: Mon, 04 Nov 2024 16:46:19 +0000
Gerrit-HasComments: Yes

Reply via email to