On Fri, 8 Nov 2024 16:48:33 GMT, Lothar Kimmeringer <[email protected]> wrote:
>> - I think we shouldn't care if someone wants to use other regex syntax
>> matching, maybe someone will find it useful. We just not going to document
>> this to avoid any confusion, most people will just use `*`. They should be
>> able to use other regex (with `*` in place of `.*`) as long as at least one
>> `*` is present and cipher suite starts with "TLS_". Filtering the pattern to
>> disallow this will result in one extra regex matching operation while we try
>> to keep things fast.
>> - About just ignoring other regex characters with `Pattern.compile("^\\Q" +
>> p.replace("*", "\\E.*\\Q") + "\\E$")`: I'm not sure if silently ignoring
>> those characters is what we want, we should either disallow them (throw an
>> exception) or allow those characters.
>> - I think the other design change we should consider is to use full regex
>> without any `*` replacement. Most people who would be setting those values
>> should be familiar with regex.
>> - Yet another option is to implement custom `*` matching method and not to
>> use regex at all.
>> - About case-insensitive matching: actually that was the original version of
>> this code, but other implementations treat cipher suites in case-sensitive
>> manner, so I modified it to be case-sensitive.
>
>> * I think we shouldn't care if someone wants to use other regex syntax
>> matching, maybe someone will find it useful. We just not going to document
>> this to avoid any confusion, most people will just use `*`.
>
> `*` isn't valid regex (which is why there is this conversion "under the
> hood") and as a user I don't expect that regex can be used if the
> documentation only mentions wildcards as they are used for globbing in e.g.
> bash. So as a user I might expect `?` for a single character to work but
> would realize quite fast that its use doesn't make much sense in the context
> of cipher-filtering.
>
>> They should be able to use other regex (with `*` in place of `.*`) as long
>> as at least one `*` is present and cipher suite starts with "TLS_".
>
> I expect funny effects with `*` being used in a different context than `.*`,
> e.g. `\\d*` (any number of digits). This would be internally converted to
> `\\d.*` which represents something else completely (a single digit, followed
> by any number of characters).
>
>> Filtering the pattern to disallow this will result in one extra regex
>> matching operation while we try to keep things fast.
>
> How often is this executed? My understanding is that this is happening during
> the startup of the VM, so one additional regex operation per cipher shouldn't
> have a big impact over an application's overall performance.
>
>> About just ignoring other regex characters with `Pattern.compile("^\\Q" +
>> p.replace("*", "\\E.*\\Q") + "\\E$")`: I'm not sure if silently ignoring
>> those characters is what we want, we should either disallow them (throw an
>> exception) or allow those characters.
>
> The support of regex isn't documented, so using special characters in the
> pattern shouldn't lead to an exception (be it because the pattern happen to
> be an invalid regex or because you throw one).
- Good point, thanks. After thinking about it I tend to agree. Also, the
colleague of mine suggested that cipher suites might have dots in the future
("TLS_1.3_*"). I'l make a change to adopt your suggestion.
- To answer your question: this code is executed on every handshake for every
supported cipher suite or algorithm.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21841#discussion_r1834880577