Hi all,

The hash_hkdf() discussion was a mess, but it finally reached a conclusion a
while ago.
Apparently, some readers are still confused. I noticed this in another thread.
I would like to make clear once more what was wrong in that discussion.
If you think I was wrong about HKDF, you should read this.


Nikita and Andrey's Opinion:
 1. Keys ("IKM") must always be strong
     (with very limited exceptions? No usage example was disclosed)
 2. From 1, "Salt" is a purely optional parameter
 3. HKDF is designed for a specific purpose
     (crypto keys only, I suppose?)

Therefore, "Salt" is the last optional parameter and hash_hkdf() returns
BINARY.


I assume Andrey has finally realized his misunderstanding and will not repeat
this ridiculous discussion. Andrey, you should have realized it when you could
not show us any valid usage example. However, I appreciate the discussion,
because without it I could not have found out exactly where you misunderstood
and why you insisted on the points above.

Anyway, their statements make no sense against the HKDF specification.
It is easy to say "You are wrong" or "You don't understand" regardless of
validity. It takes a lot of work to make clear what is really correct or
incorrect.
Those who still do not understand what the HKDF RFC is about, please
read carefully what the RFC states.



What the HKDF RFC says:

The algorithm: https://tools.ietf.org/html/rfc5869#section-2.2
----
HKDF-Extract(salt, IKM) -> PRK
PRK = HMAC-Hash(salt, IKM)
----
From this algorithm alone, HKDF clearly expects weak keys ("IKM"),
i.e. remember how hash_hmac() works.
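
To make the "remember how hash_hmac() works" point concrete, here is a minimal
sketch of the two RFC 5869 steps written with hash_hmac(). The helper name
hkdf_sketch() is hypothetical and exists only for illustration. Real code
should call the built-in hash_hkdf(). Note that in the extract step the salt is
the HMAC key and the IKM is the HMAC message.

```php
<?php
// Illustration only: RFC 5869 HKDF written with hash_hmac().
// hkdf_sketch() is a hypothetical helper, not part of PHP.
function hkdf_sketch(string $algo, string $ikm, int $length, string $info = '', string $salt = ''): string
{
    $hashLen = strlen(hash($algo, '', true));
    if ($length < 1 || $length > 255 * $hashLen) {
        throw new InvalidArgumentException('length must be between 1 and 255 * HashLen');
    }
    if ($salt === '') {
        // RFC 5869 section 2.2: if no salt is given, use HashLen zero bytes.
        $salt = str_repeat("\0", $hashLen);
    }

    // Extract: PRK = HMAC-Hash(salt, IKM). The salt is the HMAC key.
    $prk = hash_hmac($algo, $ikm, $salt, true);

    // Expand: T(i) = HMAC-Hash(PRK, T(i-1) | info | i), concatenated and truncated.
    $okm = '';
    $block = '';
    for ($i = 1; strlen($okm) < $length; $i++) {
        $block = hash_hmac($algo, $block . $info . chr($i), $prk, true);
        $okm .= $block;
    }
    return substr($okm, 0, $length);
}
```

Under these assumptions, hkdf_sketch('sha256', $ikm, 32, '', $salt) should
return the same bytes as hash_hkdf('sha256', $ikm, 32, '', $salt).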


About "Salt": https://tools.ietf.org/html/rfc5869#section-3.1
----
   HKDF is defined to operate with and without random salt.  This is
   done to accommodate applications where a salt value is not available.
   We stress, however, that the use of salt adds significantly to the
   strength of HKDF, ensuring independence between different uses of the
   hash function, supporting "source-independent" extraction, and
   strengthening the analytical results that back the HKDF design.
----
It is clear that salt is optional only for apps that cannot use a salt.
It is clear that HKDF recommends "Salt" regardless of key ("IKM") strength.
"Salt" is required for "source-independent" extraction, i.e. it makes IKM
entropy/strength irrelevant and the derived key stronger.


Omitting "Salt": https://tools.ietf.org/html/rfc5869#section-3.3
----
   In some applications, the input key material IKM may already be
   present as a cryptographically strong key.  In this case, one can skip
   the extract part and use IKM directly to key HMAC in the expand step.
----
It says "Salt" may be omitted __only when__ keys ("IKM") are already strong.
In other words, "Salt" is mandatory for keys ("IKM") that are not
cryptographically strong.

e.g. $rsa256key = hash_hkdf("sha256", $rsa128key) is misuse, _NOT_ the textbook
HKDF usage example that Nikita mentioned.
hash_hkdf("sha256", $rsa128key, 0, '', $random256salt) is the correct/designed
usage.
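
For illustration, the two call forms above look like this as runnable code.
The variable names and the use of random_bytes() are assumptions of this
sketch. $ikm merely stands in for whatever key material the application
already has.

```php
<?php
// Stand-in values for illustration; real IKM comes from the application.
$ikm  = random_bytes(16);
$salt = random_bytes(32); // a random salt of one SHA-256 hash length

// Without salt: the extract step is keyed with all zero bytes.
$unsalted = hash_hkdf('sha256', $ikm);

// With salt: $length = 0 means "hash output size", and $info is empty.
$salted = hash_hkdf('sha256', $ikm, 0, '', $salt);

// Both results are 32 raw bytes. They differ because the salt keys the
// extract step, so the same IKM yields a different derived key.
```

In this sketch the salt must be stored or shared alongside the IKM, otherwise
the same derived key cannot be reproduced later.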



Reason Why Nikita and Andrey misunderstood the RFC:

They ignored the descriptions explained above and cherry-picked irrelevant
statements in the RFC.

Application of HKDF: https://tools.ietf.org/html/rfc5869#section-4
-----
   On the other hand, it is anticipated that some applications will not
   be able to use HKDF "as-is" due to specific operational requirements,
   or will be able to use it but without the full benefits of the
   scheme.  One significant example is the derivation of cryptographic
   keys from a source of low entropy, such as a user's password.  The
   extract step in HKDF can concentrate existing entropy but cannot
   amplify entropy.  In the case of password-based KDFs, a main goal is
   to slow down dictionary attacks using two ingredients: a salt value,
   and the intentional slowing of the key derivation computation.  HKDF
   naturally accommodates the use of salt; however, a slowing down
   mechanism is not part of this specification.
-----

Especially this part

-----
   the derivation of cryptographic
   keys from a source of low entropy, such as a user's password.  The
   extract step in HKDF can concentrate existing entropy but cannot
   amplify entropy.
-----

It says HKDF cannot be used with low-entropy/weak keys ("IKM"). This appears
to support their opinion.

However, these are explanations of __"Invalid/Wrong HKDF usage"__ such as
"password hashing like PBKDF2". They are __irrelevant__ to "Valid/Correct HKDF
usage". i.e. This basically says "Hey, you shouldn't use HKDF for password
hashing as-is. HKDF is not a password hash, which requires stretching."

Key derivation and password hashing are fundamentally different operations,
because the "Salt" in password hashing is always _non-secret_ by design.
Password hashing requires entropy amplification (stretching), but key
derivation does not require amplification, owing to HMAC/cryptographic hash
characteristics.

Even with the weakest password, the password (IKM) is protected and the OKM is
secure, because HKDF's "Salt" is designed to be either "secret" or
"non-secret", as explicitly described in the RFC. A secret salt keeps keys
secure. This is obvious from the algorithm, i.e. remember how hash_hmac()
works again.


Lastly,
Application of HKDF: https://tools.ietf.org/html/rfc5869#section-4
states in its first sentence:
-----
HKDF is intended for use in a wide variety of KDF applications.
-----
It is obvious that HKDF is a general-purpose KDF. The RFC even explains
non-KDF usage as a CSPRNG.



My Comments:
From the algorithm alone, it is very clear to me that HKDF expects weak keys,
that salt is used as additional entropy/complexity for key security, and that
salt should always be used.

It was very difficult for me to understand what Nikita and Andrey were
insisting on and referring to, because their discussion was very confusing and
makes no sense with the HKDF RFC (and with HMAC and crypto hash
characteristics).

Current API
  - encourages insecure misuse: "Salt" is always mandatory, with the exception
    of cases where a salt cannot be used.
  - is extremely insecure: weak keys are vulnerable without a "secret Salt",
    i.e. who would use hash_hmac() without keys?

In addition, it has a completely unnecessary API inconsistency with respect to
other hash functions.
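
As a small illustration of that inconsistency, the sketch below puts
hash_hmac() and hash_hkdf() side by side. The variable names and input values
are made up for the example.

```php
<?php
// Example inputs only; any key and message would do.
$key  = random_bytes(32);
$data = 'example message';

$mac = hash_hmac('sha256', $data, $key);        // 64 hex characters by default
$raw = hash_hmac('sha256', $data, $key, true);  // 32 raw bytes only when requested
$okm = hash_hkdf('sha256', $key, 0, 'example'); // 32 raw bytes, with no hex option

// Callers who want a printable derived key must convert it themselves.
$okmHex = bin2hex($okm);
```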

Nikita and Andrey, don't you have something to say about this mess?

IMO, you should ask the list yourselves, so that your ridiculous hash_hkdf()
API is not kept forever because of your misunderstanding. If I were you, I
would not leave behind such a shameful API that everyone would laugh at.

RMs, if I were you, I would seriously consider fixing the API. Promoting
insecure _key_ derivation forever is nonsense.

This will be my last post on this issue, unless there are people who still
don't understand what HKDF is.

Regards,

P.S.
No one really reads the Internet RFCs. This is another issue...
The session security and input data security discussions followed a similar
pattern.
I argue based on standards and/or guidelines, while others just ignore
standards/guidelines/security because they don't use or are not familiar with
them.
If it were simply a matter of preference, that would be acceptable. However,
for security-related issues, this is not a healthy technical discussion
at all.


--
Yasuo Ohgaki
yohg...@ohgaki.net

On Wed, Sep 6, 2017 at 10:15 AM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:

> Hi all,
>
> This is the last recommendation for hash_hkdf[1]. In fact,
> this would be the last chance to fix because we'll have 7.2 soon.
> The issue is secure usage and API consistency.
>
> Currently hash_hkdf() has following signature:
>
> <binary string> hash_hkdf(string $algo , string $ikm [, int $length = 0 [,
> string $info = '' [, string $salt = '' ]]] )
>
> These are the rationales behind the recommendation. There are many, but
> please read on.
>
> === Parameter Order ===
>
> HKDF[2] algorithm is:
>  1. General purpose key derivation function as per RFC 5869
>  2. "$salt" parameter is a "pre-shared _KEY_" in many cases, as mentioned in
>     RFC 5869
>  3. "$salt" (or pre-shared key) is very strongly recommended for security
>     reasons as per RFC 5869
>  4. Supplying a salt of the same length as the input key does not affect
>     performance as per RFC 5869
>  5. "$info" is what makes HKDF useful, but it is less important, as
>     described in RFC 5869
>  6. "$length" is truly an optional parameter for a very specific encryption
>     algorithm or usage.
>
> Rationale for change:
>  1. Key derivation without a secondary key ($salt) does not make sense when
>      a secondary key could be available. HKDF is designed for the best
>      possible key security with the key. Not using a secondary key ($salt)
>      simply downgrades key security for no reason, i.e. HKDF performance is
>      the same when a $salt of the same length as the hash is set.
>  2. HKDF is based on HMAC. When $info has no use, HMAC would be the best
>      choice for it, i.e. $newkey = hash_hmac($algo, $ikm, $key);
>  3. It does not make sense to violate RFC recommendations in an RFC
>      implementation.
>
> From these facts and reasons, $salt, $info and $length parameter order and
> requirement should be changed from
>
> string $algo , string $ikm [, int $length = 0 [, string $info = '' [,
> string $salt = '' ]]]
>
> to
>
> string $algo , string $ikm , string $salt, string $info = '' [,
> int $length = 0 ]
> Note: Users can set empty string if they really don't need $salt and/or
> $info.
>
> Conclusion:
> This way, users would have a better chance of using hash_hkdf() more
> securely and properly.
>
> [1] http://php.net/hash_hkdf
> [2] http://www.faqs.org/rfcs/rfc5869.html
>
> === Return Value and Output Option ===
>
> The most common HKDF usage with PHP would be:
>  1. CSRF token generation that is specific to a request with expiration
> time.
>      (HEX return value would be appropriate, not BINARY)
>  2. API access token generation that does not transmit "The API Key", but
>      derived key by HKDF. It also should have expiration time.
>      (HEX return value would be appropriate, not BINARY)
>
> Consistency with other hash_*() functions:
>  1. All other hash_*() functions return a HEX string hash value.
>  2. hash_hkdf() is the only one that returns a BINARY hash value.
>
> Conclusion:
> hash_hkdf() should return HEX by default, not BINARY.
> Optional [, bool $raw_output = false ] should be added just like other
> hash_*().
>
>
> === Compatibility ===
>
> IMHO, the current hash_hkdf() should not have been added in PHP 7.1.2, but
> in 7.2.0 in the first place. The mess could be resolved by 7.2.
>
> Anyway, hash_hkdf() was added in 7.1.2. Hopefully not many users are
> using it yet. If we change the API with the 7.2 release, there would be the
> least possible confusion. (We may remove it from 7.1 to avoid confusion, too)
>
> Our choices:
>  - Keep the current insecure/inconsistent API forever.
>  - Change the API to have secure/consistent API forever.
>
> Conclusion:
> No conclusion for this. There would be conflicting options.
>
>
> I strongly think it is not worth keeping this insecure/inconsistent API
> forever.
> I prefer to change the API to what it should be.
>
> What should we do for this?
> Comments?
>
>
>
> P.S.
>
> Nikita, you've said following during HKDF discussion:
>  - HKDF for CSRF tokens/etc does not make sense at all.
>  - Current parameter order and return value makes perfect sense
>    and has valid/common usages.
>  - Everyone on this list shouldn't listen to me because I'm insane and
>    totally misunderstood what HKDF is.
>
> The phrases are not exact, but they should be close enough. Part of this
> is my fault, for not reading mail and/or poor English. I blindly assumed
> you understood RFC 5869 and the possible common usages with PHP. I
> apologized for the confusion and tried to explain why, without success.
>
> If you still believe what you said is correct, I do not ask you to
> take it back. Show us at least one common/reasonable hash_hkdf()
> usage example for the current API, and point out what is wrong with my
> recommendations and rationales above.
>
> If not, I request that you take it back. I respect your contributions very
> much, but the last thing you said is beyond what I can tolerate.
>
> --
> Yasuo Ohgaki
> yohg...@ohgaki.net
>
>
