On Mon, May 20, 2019, at 8:05 AM, Rasmus Schultz wrote:
> I found some issues with the description of keys in section 1.2 of the 
> PSR-16 spec.
> 
> First off:
> 
> "Implementing libraries MUST support keys consisting of the characters 
> A-Z, a-z, 0-9, _, and . in any order in UTF-8 encoding and a length of 
> up to 64 characters."
> 
> So the only characters that MUST be supported are ASCII characters - 
> but then, in the same sentence, UTF-8 encoding is stipulated?
> 
> If the only characters supported are ASCII characters, stipulating 
> UTF-8 encoding doesn't seem to make any sense.
> 
> Am I to understand that *if* the implementation supports more than the 
> required ASCII characters, it must use UTF-8 encoding?
> 
> And if the implementation *does* support UTF-8, then presumably the 
> stipulated minimum length is 64 Unicode runes? e.g. larger than the 64 
> bytes required to support 64 ASCII characters?
> 
> Secondly:
> 
> "Libraries are responsible for their own escaping of key strings as 
> appropriate, but MUST be able to return the original unmodified key 
> string"
> 
> How?
> 
> To my understanding, there's no API in the specification that returns keys.
> 
> So this clause seems unnecessary? How the implementation stores keys 
> internally, or whether it is able to recover them, doesn't seem like it 
> should be a concern as such?
> 
> (Possibly, this clause was relevant to PSR-6 and may have carried over 
> unintentionally?)
> 
> Thanks,
>  Rasmus

All of this language was carried over from PSR-6, yes.

For the first part, there are encodings beyond ASCII and UTF-8, even though 
they are not often seen in the western world, and some of them are incompatible 
with UTF-8/ASCII, even for the low code points.  For instance, UTF-16 and UTF-32 
are incompatible with UTF-8, because they use 2- and 4-byte code units, so even 
plain ASCII characters come out as different bytes than in UTF-8.
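
To make the incompatibility concrete, here is a minimal sketch in Python (used 
purely for illustration; the key "user.42" is a made-up example and none of 
this is part of the PSR).  It encodes the same ASCII-only key three ways and 
compares the resulting bytes:

```python
# Hypothetical ASCII-only cache key, chosen only for illustration.
key = "user.42"

ascii_bytes = key.encode("ascii")
utf8_bytes = key.encode("utf-8")
utf16_bytes = key.encode("utf-16-le")  # little-endian, no byte-order mark
utf32_bytes = key.encode("utf-32-le")  # little-endian, no byte-order mark

# For ASCII-only text, UTF-8 produces exactly the ASCII bytes.
print(utf8_bytes == ascii_bytes)

# UTF-16 and UTF-32 produce different (and longer) byte strings.
print(len(utf8_bytes), len(utf16_bytes), len(utf32_bytes))
print(utf16_bytes == utf8_bytes)
```

So a backend that treats key bytes as UTF-16 would not see the same key that a 
UTF-8 caller sent, even when every character is in the required ASCII range.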

So an implementation that uses UTF-16 natively to store/interpret/return the 
key string is Doing It Wrong(tm), per PSR-6/16.  And yes, that means it may 
need more than 64 bytes if someone stores a UTF-8 Japanese glyph or poop emoji 
as their cache key.  (I would reject a PR that does the latter, but it would 
technically be spec-compliant.)
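
As a rough illustration of the byte counts involved, the following Python 
sketch compares character length and UTF-8 byte length for three 64-character 
keys.  The helper name `key_lengths` and the sample keys are hypothetical, and 
it assumes "character" means Unicode code point, which the spec does not spell 
out:

```python
def key_lengths(key):
    """Return (code points, UTF-8 bytes) for a key string."""
    return len(key), len(key.encode("utf-8"))

ascii_key = "a" * 64              # 1 byte per character in UTF-8
japanese_key = "日" * 64           # U+65E5, 3 bytes per character in UTF-8
emoji_key = "\U0001F4A9" * 64     # U+1F4A9 (pile of poo), 4 bytes per character

for name, key in [("ascii", ascii_key),
                  ("japanese", japanese_key),
                  ("emoji", emoji_key)]:
    chars, nbytes = key_lengths(key)
    print(name, chars, nbytes)
```

Under that assumption, a storage field sized for 64 bytes would only be enough 
for the all-ASCII key.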

For the second point, "return" is a bit misleading here, and is probably just a 
carry-over from PSR-6.  In practice it means that if I store a cache key with a 
poop emoji, then I should be able to reliably look it up with a poop-emoji key. 
 If the key is mangled in storage such that I cannot look it up with the same 
key as it was stored with, then the implementation is Doing It Wrong(tm).
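
One way to picture that requirement is the sketch below, again in Python for 
illustration only.  `EscapingCache` is a hypothetical in-memory stand-in for a 
backend that only accepts ASCII names; it is not PSR-16's interface or any real 
library's API.  It escapes keys with percent-encoding, which is reversible, and 
contrasts that with a lossy scheme that silently drops non-ASCII characters:

```python
from urllib.parse import quote, unquote


class EscapingCache:
    """Hypothetical cache whose backing store only accepts ASCII names."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _escape(key):
        # Percent-encode the key's UTF-8 bytes; '%' itself is escaped too,
        # so two different keys never map to the same stored name.
        return quote(key, safe="")

    def set(self, key, value):
        self._store[self._escape(key)] = value

    def get(self, key, default=None):
        return self._store.get(self._escape(key), default)


def lossy_escape(key):
    # Anti-example: dropping non-ASCII characters makes distinct keys collide.
    return key.encode("ascii", "ignore").decode("ascii")


cache = EscapingCache()
cache.set("\U0001F4A9", "stored under an emoji key")
print(cache.get("\U0001F4A9"))

escaped = EscapingCache._escape("\U0001F4A9")
print(escaped.isascii(), unquote(escaped) == "\U0001F4A9")
print(lossy_escape("\U0001F4A9a") == lossy_escape("a"))
```

The reversible escaping lets the same key find the same entry every time; the 
lossy version is the kind of mangling described above.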

--Larry Garfield
