> On 22 Aug 2014, at 18:31, Oleg Kalnichevski <[email protected]> wrote:
> 
> On Fri, 2014-08-22 at 12:47 +0200, Dirk-Willem van Gulik wrote:
>>>> Found that some of the ones below are indeed able to hang the regex stack
>>>> (e.g. #2). However, the more elaborate regexes are blocked by:
>>>> 
>>>>    private final static Pattern WILDCARD_PATTERN = Pattern.compile( 
>>>> "^[a-z0-9\\-\\*]+(\\.[a-z0-9\\-]+){2,}$", Pattern.CASE_INSENSITIVE);
>>>>            ..
>>>>            WILDCARD_PATTERN.matcher(identity).matches()
>>>> 
>>>> which we apply to the subjectAltName, CN, etc. So that is not too bad then
>>>> - assuming that that regexp does not let them through. Which is likely - as
>>>> the only dangerous thing I see in there is a *.
>>>> 
>>> 
>>> Thank you so much for your feedback. What I could do is validate both
>>> the identity and the subjectAltName pattern by making sure they consist
>>> of characters legal for domain names (alphanumeric, dash and asterisk in
>>> case of subjectAltName) prior to doing regexp matching with them.
>> 
>> Right - but I am wondering if that means we end up in a rearguard battle.
>> As we then find IPv6 addresses containing ':' and god knows what new TLDs
>> may do 5+ years hence.
>> 
> 
> 5+ is pretty much my retirement target ;-) 
> 
> Seriously, though, I would worry about UTF8 issues only once we start
> getting angry complaints from users. Right now I would rather be too
> restrictive than too liberal.
> 
>> Now *all* that is allowed is '*' - and, as far as I know, only in string
>> (and not IPv4/IPv6) based entries.
>> 
>> So perhaps it is an option to compare things from the TLD down with a very 
>> very simple loop.
>> 
>>      if (certificate name starts with a star) then
>>              @a = array of the host FQDN split on '.'
>>              @b = array of the certificate FQDN split on '.'
>> 
>>              if not the right lengths - bail
>>              working from the topmost side down to the last but one
>>                      bail if not the same.
>>              check if we have left just one entry on a and a wildcard on b.
>> 
>> i.e. avoid wildcards completely.
> 
> Please correct me if I am wrong, but after rereading the relevant RFCs I was
> under the impression that complex wildcard expressions in subjectAltName
> like 
> 
> a*b*c*d.mydomain.com
> 
> were perfectly legal. This was the primary reason why I felt the use of
> regex matching was beneficial. Should we revert to supporting simple
> '*', 'blah*' expressions only?

Not sure. I am doing more research after reading the RFCs: they are much
stricter about i18n domains, and I am not sure I understand all the
implications.
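
For reference, the label-by-label loop from the pseudocode quoted above could
look roughly like this in Java. It is only a sketch under stated assumptions,
not a patch: the class name SimpleWildcardMatcher and the method matches are
made up for illustration, and the three-label minimum for wildcard names is an
assumption that mirrors the {2,} in WILDCARD_PATTERN. It accepts '*' only as
the entire leftmost label, so it deliberately rejects names such as
a*b*c*d.mydomain.com.

```java
import java.util.Locale;

/** Hypothetical helper: compares host and certificate name label by label. */
public final class SimpleWildcardMatcher {

    private SimpleWildcardMatcher() {
    }

    /**
     * Returns true if host equals pattern (case-insensitively), or if pattern
     * has "*" as its whole leftmost label and all other labels are equal.
     */
    public static boolean matches(final String host, final String pattern) {
        if (host == null || pattern == null) {
            return false;
        }
        // The split expression is fixed; nothing from the certificate is ever
        // compiled into a regex. Limit -1 keeps trailing empty labels so that
        // "example.com." is not silently treated as "example.com".
        final String[] a = host.toLowerCase(Locale.ROOT).split("\\.", -1);
        final String[] b = pattern.toLowerCase(Locale.ROOT).split("\\.", -1);

        // Not the right lengths: bail.
        if (a.length != b.length) {
            return false;
        }
        final boolean wildcard = b[0].equals("*");
        // Assumption: wildcard names need at least three labels, as in the
        // existing WILDCARD_PATTERN.
        if (wildcard && b.length < 3) {
            return false;
        }
        // Work from the TLD down to the last but one label; bail on any
        // difference, on empty labels, and on a '*' outside the first label.
        for (int i = a.length - 1; i >= 1; i--) {
            if (a[i].isEmpty() || b[i].indexOf('*') >= 0 || !a[i].equals(b[i])) {
                return false;
            }
        }
        // One label is left on each side: a wildcard on b, or an exact match.
        if (a[0].isEmpty()) {
            return false;
        }
        if (wildcard) {
            return true;
        }
        return b[0].indexOf('*') < 0 && a[0].equals(b[0]);
    }
}
```

With this, SimpleWildcardMatcher.matches("foo.example.com", "*.example.com")
is accepted, while "a.b.example.com" and "example.com" are both rejected for
the same pattern because the label counts differ.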

Dw.
