On Wed, Aug 30, 2017 at 05:44:01PM +0100, Robert Wilton wrote:
>
> First question: How many pattern statements in draft and standard IETF YANG
> modules actually use Unicode properties (e.g. \p{})?
> Answer: Just 2, both to add a zone at the end of an IPv4/IPv6 address.
>
> E.g. pattern
> '(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}'
> + '([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'
> + '(%[\p{N}\p{L}]+)?';
>
> This could quite possibly have been written just as
> "(\d{1,3}\.){3}\d{1,3}(%\w+)?" and not use Unicode properties at all.
Shorter but less precise. The thread started with a proposal to ban
\d, yet you seem to like it. Note that, as far as I know, \d is not
the same as [0-9] in Unicode: \d is defined to be \p{Nd}, and Nd
contains far more than [0-9].
https://www.w3.org/TR/xmlschema-2/#regexs
http://www.fileformat.info/info/unicode/category/Nd/list.htm
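To make the \d versus [0-9] difference concrete, here is a minimal
sketch in Python. Note the assumption: Python's re module is not the
XSD regex dialect that YANG patterns use, but for str patterns (without
re.ASCII) its \d is likewise documented to match any Unicode decimal
digit, so it serves as a stand-in. The sample string is my own choice.

```python
import re
import unicodedata

# Three Arabic-Indic digits (U+0661..U+0663); chosen only as an
# illustration of non-ASCII characters in general category Nd.
arabic_indic = "\u0661\u0662\u0663"

# Look up the Unicode general category of each character.
categories = [unicodedata.category(ch) for ch in arabic_indic]

# \d (Unicode-aware for str patterns) versus the explicit ASCII class.
matches_backslash_d = re.fullmatch(r"\d+", arabic_indic) is not None
matches_ascii_class = re.fullmatch(r"[0-9]+", arabic_indic) is not None

print(categories, matches_backslash_d, matches_ascii_class)
```

If the XSD \d behaves as described above, a pattern written with \d
would accept such digits in a YANG string as well, while [0-9] would not.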
Perhaps the usage of \p{N} and \p{L} above is not quite right. (I
recall trying to find out what exactly the rules for a zone index
are; often you discover that there is no precise definition.) My
standpoint is that the WGs are responsible for working out the
patterns and for deciding how strict they want them to be. The
pattern in RFC 6991 rejects an 'IP address' of the form 321.1.2.3 or
01.2.3.4, and I think this is a good thing, but it is ultimately the
decision of the WG producing the YANG module what its patterns should
look like and how strict they are.
And we should separate the discussion of how strict a pattern should
be from the discussion of using Unicode constructs or other 'more
recent' constructs in patterns.
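
As a rough illustration of the strictness point, the following Python
sketch compares the octet part of the pattern quoted above with a loose
\d{1,3}-style pattern. Assumptions: the loose pattern is my reading of
the shorter form, the zone suffix is left out because Python's re has
no \p{...} support, and fullmatch stands in for the implicit anchoring
of YANG patterns. The names OCTET, STRICT, LOOSE and accepts are made
up for this example.

```python
import re

# One octet, 0..255 without leading zeros, as in the quoted pattern.
OCTET = r"([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"

# Strict dotted quad built from the octet alternation (zone omitted).
STRICT = re.compile(r"(" + OCTET + r"\.){3}" + OCTET)

# Loose variant: any one to three digits per field (assumed form).
LOOSE = re.compile(r"(\d{1,3}\.){3}\d{1,3}")


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


for addr in ["192.0.2.1", "321.1.2.3", "01.2.3.4"]:
    print(addr, accepts(STRICT, addr), accepts(LOOSE, addr))
```

The strict pattern accepts only the first address; the loose one
accepts all three. Which behaviour is wanted is the strictness question,
independent of whether Unicode properties appear in the pattern.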
/js
--
Juergen Schoenwaelder Jacobs University Bremen gGmbH
Phone: +49 421 200 3587 Campus Ring 1 | 28759 Bremen | Germany
Fax: +49 421 200 3103 <http://www.jacobs-university.de/>