[
https://issues.apache.org/jira/browse/NIFI-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394965#comment-15394965
]
Andy LoPresto commented on NIFI-1465:
-------------------------------------
A lot of additional context from this email thread [1].
{quote}
Brett,
(I added the dev list back in because this is probably of interest to someone
and should be documented.)
This started out as a brief email and spiraled into a couple days of work. Part
of that is because my Ruby is rusty, but part is because there are some serious
underlying issues here (previously quickly noted in NIFI-1465 [1] but not fully
expounded upon at the time). The tl;dr of this email is that I have written a
Ruby script which will accomplish what you want and it is located here [2].
Read on at your own risk (various parts of this email were written over the
course of the last two days, so it may be repetitive/incoherent where the story
changed).
This is a confusing case because it is unusual in execution and there are
multiple layers here. I think I didn’t explain it well last time. I’ll try to
step through it, but I also apologize in advance, because I found a lot of
legacy stuff here that was probably written with no intention of ever being
exposed/integrated with an external source. Kerckhoffs would not approve. I will
add some of this to NIFI-1465.
The mechanism that NiFi uses to encrypt the sensitive property values (i.e.
passwords for EncryptContent) is as I described previously, but in further
investigating to help solve your problem, I realized that the Jasypt [3]
StandardPBEStringEncryptor used in StandardFlowSynchronizer uses a random salt
generator internally. You can verify this by making a new flow with two
EncryptContent processors — even if you set the same password for each, the
resulting cipher texts in the flow.xml will differ because, although the same
master passphrase is used, each value is encrypted under its own random salt.
Now this is actually good news, because it means the salt must be encoded and
transmitted with the cipher text. If it was not, NiFi would not be able to
decrypt these values unless it used a fixed salt, and clearly it does not. So
as long as your Ruby code generates a salt of the correct length and embeds it
in the cipher text, it will be compatible with NiFi. The salt is the first 16
bytes (32 hex characters) and the actual cipher text representing the encrypted
processor sensitive property (happens to be another password, but this is
irrelevant) is the second 16 bytes.
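
As a small illustration of that layout, here is a sketch that splits a
hex-encoded value into its salt and cipher text. The helper name and the
placeholder input are made up for this example; only the 16-byte salt prefix
comes from the description above.

```ruby
# Hypothetical helper (not part of NiFi or the referenced script).
# Assumes the layout described above: a 16-byte salt followed by the cipher text.
SALT_PREFIX_BYTES = 16

def split_salt_and_cipher_text(hex_value)
  raw = [hex_value].pack("H*") # hex string -> raw bytes
  salt = raw.byteslice(0, SALT_PREFIX_BYTES)
  cipher_text = raw.byteslice(SALT_PREFIX_BYTES, raw.bytesize - SALT_PREFIX_BYTES)
  [salt, cipher_text]
end

# Placeholder bytes only; this is not a real NiFi value.
example = ("00" * 16) + ("ff" * 16)
salt, cipher_text = split_salt_and_cipher_text(example)
puts "salt: #{salt.bytesize} bytes, cipher text: #{cipher_text.bytesize} bytes"
```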
Note: because your initial plaintext (the password you are trying to encrypt)
is only 11 UTF-8 characters, it can be represented by 11 bytes. This means that
when encrypted using AES-CBC (16 byte block size), it requires only and exactly
one block (11 bytes of plaintext plus 5 bytes padding). The resulting cipher
text is 16 bytes. If the processor password were 16 to 31 bytes long, it
would be encrypted in two blocks (the padding always adds at least one byte)
and encoded as 32 bytes, or 64 hex characters alone (remember to add the
initial 32 chars for the salt for a total of 96 chars).
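
The length arithmetic can be checked with a short sketch. It assumes standard
PKCS#5/PKCS#7 padding (which always adds between 1 and 16 bytes) and the
16-byte salt prefix; the function and constant names are illustrative only.

```ruby
# Illustrative length calculation, assuming PKCS#5/PKCS#7 padding and a
# 16-byte salt prefix. Names are hypothetical.
CBC_BLOCK_BYTES = 16
EMBEDDED_SALT_BYTES = 16

def encrypted_hex_length(plaintext)
  # Padding adds 1..16 bytes, so an exact multiple of the block size
  # still gains one full extra block.
  padded = (plaintext.bytesize / CBC_BLOCK_BYTES + 1) * CBC_BLOCK_BYTES
  (EMBEDDED_SALT_BYTES + padded) * 2 # two hex characters per byte
end

puts encrypted_hex_length("a" * 11) # one block plus salt
puts encrypted_hex_length("a" * 20) # two blocks plus salt
```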
What we are looking for as the output of the Ruby operation is, as you noted, a
64 character hex string, of the format:
output = hex_encode(salt || encrypt(processor_password, master_key, iv))
where the master_key and iv are derived by
(master_key, iv) = md5(master_passphrase || master_salt) ||
md5(md5(master_passphrase || master_salt) || master_passphrase || master_salt)
|| md5(md5(md5(…)…)…)
This is an unusual method and is described thusly on the OpenSSL EVP_BytesToKey
documentation [4]:
If the total key and IV length is less than the digest length and MD5 is used
then the derivation algorithm is compatible with PKCS#5 v1.5 otherwise a non
standard extension is used to derive the extra data.
...
KEY DERIVATION ALGORITHM
The key and IV is derived by concatenating D_1, D_2, etc until enough data is
available for the key and IV. D_i is defined as:
D_i = HASH^count(D_(i-1) || data || salt)
where || denotes concatenation, D_0 is empty, HASH is the digest algorithm in
use, HASH^1(data) is simply HASH(data), HASH^2(data) is HASH(HASH(data)) and so
on.
The initial bytes are used for the key and the subsequent bytes for the IV.
The reason there are multiple MD5 operations above is because we have specified
the encryption will use AES-256-CBC, which requires a 256 bit (32 byte) key —
32 bytes are represented by 64 hex characters. A single iteration of MD5 only
yields 16 bytes (32 hex chars), so we must concatenate it with another
invocation. However, as it is deterministic, running it on the same input would
return the same output, and the key would just repeat the same 16 bytes. To
counter this, the second (and each subsequent) invocation “salts” the input with the
result of the previous step. If we substitute some variables for the full
expression above, we can see this more clearly:
let x = “master_passphrase || master_salt”
master_key = md5(master_passphrase || master_salt) || md5(md5(master_passphrase
|| master_salt) || master_passphrase || master_salt)
master_key = md5(x) || md5(md5(x) || x)
let y = “md5(x)”
master_key = y || md5(y || x)
Indeed, the necessary length of the output is greater, as we need another 16
bytes for the IV, so we continue with this series:
let z = “md5(y || x)”
iv = md5(z || x)
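
The chained derivation can be sketched in a few lines of Ruby, following the
D_i definition quoted from the OpenSSL documentation with MD5 and a count of
1. This is a minimal sketch, not the referenced script; the method name and
default lengths are assumptions chosen for AES-256-CBC.

```ruby
require "digest"

# Sketch of the MD5-based chain with a single iteration (count = 1).
# Hypothetical method name; accepts a salt of any length.
def derive_key_and_iv(passphrase, salt, key_length = 32, iv_length = 16)
  data = passphrase.b + salt.b # x = master_passphrase || master_salt
  derived = "".b
  previous = "".b              # D_0 is empty
  while derived.bytesize < key_length + iv_length
    previous = Digest::MD5.digest(previous + data) # D_i = MD5(D_(i-1) || x)
    derived << previous
  end
  # The initial bytes are the key, the following bytes are the IV.
  [derived.byteslice(0, key_length), derived.byteslice(key_length, iv_length)]
end

key, iv = derive_key_and_iv("master_passphrase", "0123456789abcdef")
puts "key: #{key.unpack1('H*')}"
puts "iv:  #{iv.unpack1('H*')}"
```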
So to revisit the NiFi encryptor, it requires a random salt to be embedded at
the beginning of the cipher text so it can be split off before decryption to
seed the cipher object. To mimic its behavior in Ruby, we’ll have to match
those parameters.
Normally this would be simple. You would use the cipher.pkcs5_keyivgen method
to derive the encryption key and IV from the master passphrase and salt. You
would then perform the encryption normally, using AES-CBC, the key, and the IV,
and concatenate the salt with the cipher text, and be good to go.
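
For reference, that "normal" path might look like the sketch below when the
salt is 8 bytes. It uses Ruby's OpenSSL::Cipher#pkcs5_keyivgen with one MD5
iteration; the method name and the sample inputs are placeholders, and this
is not the format NiFi reads (NiFi's salt is 16 bytes, as discussed next).

```ruby
require "openssl"

# Sketch of the standard path with an 8-byte salt. Hypothetical method name.
def encrypt_with_keyivgen(plaintext, passphrase, salt)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.encrypt
  # Derive key and IV with one MD5 iteration, as described above.
  cipher.pkcs5_keyivgen(passphrase, salt, 1, OpenSSL::Digest.new("MD5"))
  cipher_text = cipher.update(plaintext) + cipher.final
  (salt + cipher_text).unpack1("H*") # hex_encode(salt || cipher_text)
end

salt = OpenSSL::Random.random_bytes(8)
puts encrypt_with_keyivgen("password123", "master_passphrase", salt)
```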
Here is where it gets nasty.
As this feature was developed early in NiFi’s history, it leverages a library
called Jasypt, specifically a class called StandardPBEStringEncryptor [5]
wrapped in a local class StringEncryptor [6], to perform all encryption and
decryption. I cannot speak for the developer of Jasypt, but they decided to set
the salt size to the block size of whatever cipher was being used. For AES,
that means 16 bytes. This is not the worst idea in the world, but it has
serious consequences when used in conjunction with an algorithm that requires a
specific salt length. In our case, the OpenSSL EVP_BytesToKey method expects
(and enforces) an 8 byte salt. Jasypt does not expose a mechanism to provide a
custom salt length other than by injecting a new SaltGenerator [7]
implementation at initialization time, and this SaltGenerator is not aware of
the algorithm selected for the encryptor.
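
The salt-length restriction is easy to observe from Ruby. The sketch below
assumes the standard Ruby OpenSSL binding, which to my knowledge raises an
error for any salt that is not exactly 8 bytes; the helper name is made up.

```ruby
require "openssl"

# Hypothetical probe: returns true if pkcs5_keyivgen accepts the given salt.
def keyivgen_accepts_salt?(salt)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.encrypt
  cipher.pkcs5_keyivgen("master_passphrase", salt, 1, OpenSSL::Digest.new("MD5"))
  true
rescue OpenSSL::OpenSSLError
  # Raised when the salt is not an 8-octet string.
  false
end

puts keyivgen_accepts_salt?("12345678")         # 8-byte salt
puts keyivgen_accepts_salt?("0123456789abcdef") # 16-byte salt, as Jasypt generates for AES
```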
If we could intercept and override this value (which I was able to do via
Groovy reflection [8] and breaking Java access controls), we could set it to 8
bytes, so Jasypt would follow the EVP_BytesToKey implementation. However, we
cannot do this for NiFi itself: first, it would require fighting the intention
of the library and Java access controls; second, it would be a breaking change,
as no existing flow could decrypt its sensitive properties. When
the default protection scheme is improved in NIFI-1465, this will be addressed
using a migration tool.
But the result of that decision is that we cannot simply use the
cipher.pkcs5_keyivgen method that wraps all of that logic to generate the key
and IV in a “standard” way.
At this point, I was lucky enough to come across Ola Bini’s work [9] in porting
the OpenSSL methods to JRuby. I was able to modify his implementation in Groovy
to handle an arbitrary salt length, and then translate that back to Ruby. It is
probably not the cleanest or most Ruby-idiomatic implementation because I
haven’t touched the language in a few years, so feel free to clean it up, but
it is functionally compatible with Jasypt and OpenSSL (for their respective
salt lengths).
You’ll have to adapt the Ruby scripts I provided to handle whatever your
key/salt/value input mechanisms are, but currently you can just edit the
script, populate the key, salt, and sensitive property values, and run the
script. The output is of the form “enc{abcdef…}” so you can immediately
populate your templates with it.
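
To make the overall shape concrete, here is a minimal end-to-end sketch. It
is not the script linked at [2]; the method names are hypothetical, and it
assumes that the 16-byte salt embedded in the output is the same salt fed
into the MD5 derivation, with AES-256-CBC and a single iteration.

```ruby
require "digest"
require "openssl"

# MD5 chain that accepts any salt length (hypothetical name).
def evp_bytes_to_key(passphrase, salt, key_length, iv_length)
  data = passphrase.b + salt.b
  derived = "".b
  previous = "".b
  while derived.bytesize < key_length + iv_length
    previous = Digest::MD5.digest(previous + data)
    derived << previous
  end
  [derived.byteslice(0, key_length), derived.byteslice(key_length, iv_length)]
end

# Builds a cipher in the given mode (:encrypt or :decrypt) for this salt.
def build_cipher(mode, master_passphrase, salt)
  key, iv = evp_bytes_to_key(master_passphrase, salt, 32, 16)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.public_send(mode)
  cipher.key = key
  cipher.iv = iv
  cipher
end

def encrypt_sensitive_value(plaintext, master_passphrase,
                            salt = OpenSSL::Random.random_bytes(16))
  cipher = build_cipher(:encrypt, master_passphrase, salt)
  cipher_text = cipher.update(plaintext) + cipher.final
  "enc{#{(salt + cipher_text).unpack1('H*')}}"
end

def decrypt_sensitive_value(wrapped, master_passphrase)
  hex = wrapped[/\Aenc\{(\h+)\}\z/, 1] or raise ArgumentError, "expected enc{...}"
  raw = [hex].pack("H*")
  salt = raw.byteslice(0, 16) # salt is split off the front
  cipher = build_cipher(:decrypt, master_passphrase, salt)
  plain = cipher.update(raw.byteslice(16, raw.bytesize - 16)) + cipher.final
  plain.force_encoding("UTF-8")
end

wrapped = encrypt_sensitive_value("password123", "master_passphrase")
puts wrapped
puts decrypt_sensitive_value(wrapped, "master_passphrase")
```

The round trip only shows that the sketch is self-consistent; compatibility
with a given NiFi instance would still need to be checked against a real flow.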
Because unit tests only provide confidence over the specific system under test,
I have verified this by making a flow which encrypted data using a key derived
from “password123” and decrypted the same data using a key derived from
“password456”. Both of these values were encrypted and written to the flow.xml
file. Obviously, the decryption was failing. I stopped NiFi, ran the script,
unzipped the flow.xml.gz to an XML file, copied the output into the second
processor properties, rezipped the XML file to flow.xml.gz, and restarted the
flow. The data was now successfully decrypted.
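
The unzip/edit/rezip step could also be scripted. The sketch below is only
an illustration using Ruby's Zlib; the method name is made up, it does a
plain string replacement rather than parsing the XML, and it should only be
run while NiFi is stopped and after backing up the file.

```ruby
require "zlib"

# Hypothetical helper: swap one enc{...} value for another inside a gzipped
# flow file, rewriting the file in place.
def replace_in_gzipped_flow(path, old_value, new_value)
  xml = Zlib::GzipReader.open(path) { |gz| gz.read }
  raise ArgumentError, "value not found in #{path}" unless xml.include?(old_value)
  # Block form of gsub avoids backslash interpretation in the replacement.
  updated = xml.gsub(old_value) { new_value }
  Zlib::GzipWriter.open(path) { |gz| gz.write(updated) }
  updated
end

# Usage (paths and values are placeholders):
#   replace_in_gzipped_flow("conf/flow.xml.gz", "enc{old...}", "enc{new...}")
```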
I can provide screenshots/logs if necessary.
I hope this helps. If you have further questions, please let me know.
[1] https://issues.apache.org/jira/browse/NIFI-1465
[2] https://github.com/alopresto/nifi/blob/4309f9e9199ad2038c06eb8f2ef51d3ff0418e53/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/resources/openssl_3_brett_demo.rb
[3] http://www.jasypt.org/
[4] https://www.openssl.org/docs/manmaster/crypto/EVP_BytesToKey.html
[5] http://svn.code.sf.net/p/jasypt/code/trunk/jasypt/src/main/java/org/jasypt/encryption/pbe/StandardPBEStringEncryptor.java
[6] https://github.com/alopresto/nifi/blob/4309f9e9199ad2038c06eb8f2ef51d3ff0418e53/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/encrypt/StringEncryptor.java
[7] http://svn.code.sf.net/p/jasypt/code/trunk/jasypt/src/main/java/org/jasypt/salt/SaltGenerator.java
[8] https://github.com/alopresto/nifi/blob/4309f9e9199ad2038c06eb8f2ef51d3ff0418e53/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/groovy/org/apache/nifi/encrypt/StringEncryptorGroovyTest.groovy#L106
[9] https://olabini.com/blog/2006/10/openssl-in-jruby/
{quote}
[1] https://lists.apache.org/thread.html/b93ced98eff6a77dd0a2a2f0b5785ef42a3b02de2cee5c17607a8c49@%3Cdev.nifi.apache.org%3E
> Upgrade encryption of sensitive properties
> ------------------------------------------
>
> Key: NIFI-1465
> URL: https://issues.apache.org/jira/browse/NIFI-1465
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 0.5.0
> Reporter: Andy LoPresto
> Assignee: Andy LoPresto
> Labels: encryption, security
> Original Estimate: 120h
> Remaining Estimate: 120h
>
> Currently, NiFi accepts a password and encryption algorithm in
> `nifi.properties` which are used to encrypt all sensitive processor
> properties throughout the application. The password defaults to empty and the
> algorithm defaults to {{PBEWITHMD5AND256BITAES-CBC-OPENSSL}}. This algorithm:
> * uses a digest function ({{MD5}}) which is not cryptographically secure
> [1][2][3][4]
> * uses a single iteration count [5][6]
> * limits password input to 16 characters on JVMs without the unlimited
> strength cryptographic jurisdiction policy files installed [NIFI-1255]
> all of which combine to make it extremely insecure. We should change the
> default algorithm to use a strong key derivation function (KDF) [7] which
> will properly derive a key to protect the sensitive properties.
> Because existing systems have already encrypted the properties using a key
> derived from the original settings, we should provide a translation/upgrade
> utility to seamlessly convert the stored values from the old password &
> algorithm combination to the new.
> [1] http://security.stackexchange.com/a/19908/16485
> [2] http://security.stackexchange.com/a/31846/16485
> [3] http://security.stackexchange.com/questions/52461/how-weak-is-md5-as-a-password-hashing-function
> [4] http://security.stackexchange.com/a/31410/16485
> [5] http://security.stackexchange.com/a/29139/16485
> [6] https://www.openssl.org/docs/manmaster/crypto/EVP_BytesToKey.html
> [7] https://cwiki.apache.org/confluence/display/NIFI/Key+Derivation+Function+Explanations