Thank you again, Martin!

Here's the updated webrev without the lower-case control char ids:

http://cr.openjdk.java.net/~igerasim/8230365/03/webrev/

I've also filed a CSR to record the changes in behavior:

https://bugs.openjdk.java.net/browse/JDK-8230675

Could you please help review it?


On 9/4/19 9:00 PM, Martin Buchholz wrote:
Thanks, Ivan.  We're mostly in agreement.

+     * If {@code true} then lower-case control-character ids are mapped to the
+     * their upper-case counterparts.
Extra "the".

After all these decades I only now realize that c ^= 0x40 moves '?' to the end of the ASCII range and all the other controls to the start!
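To make the XOR observation concrete, here is a minimal sketch of the mapping. The class and method names are illustrative only and are not taken from the webrev:

```java
// Illustrative sketch only: ControlXorDemo and toControl are hypothetical names.
final class ControlXorDemo {

    // Maps the id X of a \cX escape to a character by flipping bit 6 (0x40).
    static int toControl(int id) {
        return id ^ 0x40;
    }

    public static void main(String[] args) {
        // '@'..'_' (0x40..0x5f) land on the C0 controls 0x00..0x1f.
        System.out.printf("'@' -> 0x%02x%n", toControl('@'));
        System.out.printf("'A' -> 0x%02x%n", toControl('A'));
        System.out.printf("'_' -> 0x%02x%n", toControl('_'));
        // '?' (0x3f) lands on DEL, at the end of the ASCII range.
        System.out.printf("'?' -> 0x%02x%n", toControl('?'));
        // A lower-case id does not yield a control at all: 'a' (0x61) becomes '!' (0x21).
        System.out.printf("'a' -> 0x%02x%n", toControl('a'));
    }
}
```

The last line shows why an unrestricted \ca gives a surprising result: the XOR produces a printable character rather than a control.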

Should we support lower-case controls?  Compatibility with Perl regex still matters, but a lot less than in 2003.  But the key is that we got the WRONG ANSWER previously, so when we restrict the control ids let's just make lower-case controls syntax errors.  Silently changing behavior is bad for users. ... so let's abandon ALLOW_LOWERCASE_CONTROL_CHAR_IDS.
An alternative:
int ch = read() ^ 0x40;
if (!RESTRICTED_CONTROL_CHAR_IDS || ch < 0x20 || ch == 0x7f) return ch;



This code will probably be most efficient for the common case.

However, I'd prefer to use the auxiliary method isCntrlId() in this case, as it is self-documenting and still efficient enough.
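For comparison, the two approaches could look roughly like the sketch below. This is not the code from the webrev: the body of isCntrlId() here is an assumed stand-in, and the class name, the second predicate, parseControl() and the exception are all hypothetical.

```java
// Illustrative sketch only; none of this is the actual java.util.regex.Pattern code.
final class ControlIdCheck {

    // Assumed shape of a self-documenting helper: a valid control id is
    // one of '@'..'_' (0x40..0x5f) or '?' (0x3f).
    static boolean isCntrlId(int id) {
        return (id >= '@' && id <= '_') || id == '?';
    }

    // The inline alternative: flip bit 6 first, then check that the result
    // is a C0 control (below 0x20) or DEL (0x7f).
    static boolean isControlAfterXor(int id) {
        int ch = id ^ 0x40;
        return ch < 0x20 || ch == 0x7f;
    }

    // Hypothetical caller: rejects invalid ids instead of silently mapping them.
    static int parseControl(int id) {
        if (!isCntrlId(id)) {
            throw new IllegalArgumentException(
                "Illegal control escape id: 0x" + Integer.toHexString(id));
        }
        return id ^ 0x40;
    }
}
```

For non-negative input both predicates accept exactly the same ids, so the choice between them is about readability rather than behavior.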

With kind regards,

Ivan

