On 23/08/2017 12:52, Vladimir Vassilev wrote:
On 08/21/2017 05:14 PM, Robert Wilton wrote:

Hi Acee,

That makes sense.

The other thing that I think we have got wrong is how we model regex pattern statements. I think it would be much better if these were written to be less exhaustive and much simpler.

E.g. the "route distinguisher" pattern in draft-ietf-rtgwg-routing-types-09 is defined as this:

          pattern
            '(0:(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|'
          +     '6[0-4][0-9]{3}|'
          +     '[0-5]?[0-9]{0,3}[0-9]):(429496729[0-5]|'
          +     '42949672[0-8][0-9]|'
          +     '4294967[01][0-9]{2}|429496[0-6][0-9]{3}|'
          +     '42949[0-5][0-9]{4}|'
          +     '4294[0-8][0-9]{5}|429[0-3][0-9]{6}|'
          +     '42[0-8][0-9]{7}|4[01][0-9]{8}|'
          +     '[0-3]?[0-9]{0,8}[0-9]))|'
          + '(1:((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|'
          +     '25[0-5])\.){3}([0-9]|[1-9][0-9]|'
          +     '1[0-9]{2}|2[0-4][0-9]|25[0-5])):(6553[0-5]|'
          +     '655[0-2][0-9]|'
          +     '65[0-4][0-9]{2}|6[0-4][0-9]{3}|'
          +     '[0-5]?[0-9]{0,3}[0-9]))|'
          + '(2:(429496729[0-5]|42949672[0-8][0-9]|'
          +     '4294967[01][0-9]{2}|'
          +     '429496[0-6][0-9]{3}|42949[0-5][0-9]{4}|'
          +     '4294[0-8][0-9]{5}|'
          + '429[0-3][0-9]{6}|42[0-8][0-9]{7}|4[01][0-9]{8}|'
          +     '[0-3]?[0-9]{0,8}[0-9]):'
          +     '(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|'
          +     '6[0-4][0-9]{3}|'
          +     '[0-5]?[0-9]{0,3}[0-9]))|'
          + '(6(:[a-fA-F0-9]{2}){6})|'
          + '(([3-57-9a-fA-F]|[1-9a-fA-F][0-9a-fA-F]{1,3}):'
          +     '[0-9a-fA-F]{1,12})';
        }

But I think that it would be much easier to read, and quite possibly more performant to execute, if the pattern regex was written something like the following:

 pattern
    '(0:[0-9]{1,5}:[0-9]{1,10})|'
  + '(1:([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5})|'
  + '(2:[0-9]{1,10}:[0-9]{1,5})|'
  + '(6(:[a-fA-F0-9]{2}){6})';

Of course, this would allow more invalid values, but most servers would be expected to reject those anyway when they convert them into an internal binary format.
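
To make the trade-off concrete, here is a rough sketch in Python. It assumes Python's re module is a close enough stand-in for the XSD regex dialect that YANG uses, and it assumes the simplified pattern is meant to have three dotted groups plus a final octet for type 1 and no extra field for type 2. The helper names matches_structure and in_range_type0 are made up for illustration; the second one only imitates the kind of range check a server might do when converting to binary.

```python
import re

# Simplified structural pattern (assumed form, see the note above).
SIMPLE_RD = re.compile(
    r"(0:[0-9]{1,5}:[0-9]{1,10})|"
    r"(1:([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5})|"
    r"(2:[0-9]{1,10}:[0-9]{1,5})|"
    r"(6(:[a-fA-F0-9]{2}){6})"
)


def matches_structure(value: str) -> bool:
    """Return True if the whole string has one of the expected shapes."""
    return SIMPLE_RD.fullmatch(value) is not None


def in_range_type0(value: str) -> bool:
    """Hypothetical server-side check for a type 0 value.

    Type 0 carries a 2-byte ASN and a 4-byte assigned number, so both
    fields are range-checked numerically instead of inside the regex.
    """
    _, asn, number = value.split(":")
    return int(asn) <= 0xFFFF and int(number) <= 0xFFFFFFFF


if __name__ == "__main__":
    for candidate in ("0:65535:4294967295", "0:99999:1", "0:abc:1"):
        print(candidate, matches_structure(candidate))
```

With this split, "0:99999:1" gets through the pattern because it is structurally fine, and it is the numeric check that rejects it since 99999 does not fit in 2 bytes.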

What do you, and others, think?
You still need the |(([3-57-9a-fA-F]|[1-9a-fA-F][0-9a-fA-F]{1,3}):[0-9a-fA-F]{1,12}) at the end so that valid values are not rejected, though.
Sure, OK.


IMO a pattern statement has value if it absolutely defines the set of valid strings.
It still has value if it also performs some simple checks and removes obvious mistakes.

But even if a value passes the regex filter, that still doesn't guarantee that the value is correct. Someone could put a typo in there, or perhaps configure a multicast IP address where only unicast addresses are allowed, or put the same IP address on two separate interfaces, or use an IP address that they don't own, etc ...

In general I do not see the benefit of pattern statements that do not reject all invalid string instances. I prefer the original pattern or none at all.
OK, so some potential counter examples:
1) Email address. I understand that the full regex to validate all email addresses is very complex, but checking that it at least contains an @ symbol still has benefit. A short, imperfect regex seems better here than a complete, perfect one.
2) A list of VLAN ranges, e.g. we want to allow strings that look like this: "1-10,20-400,600,2000-3000", but only with non-overlapping values in ascending order. It is easy to write a regex to check that the structure is right, but AFAIK it is hard (impossible?) to write a regex that ensures that the ranges don't overlap and are specified in ascending order.
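
For the VLAN case, the split between a structural regex and a check in code could look roughly like the following Python sketch. The function name parse_vlan_ranges and the 1 to 4094 bound on VLAN IDs are assumptions made for the example, and Python's re module is again used as a stand-in for the XSD regex dialect.

```python
import re

# Structure only: comma-separated items, each either "N" or "N-M".
VLAN_LIST = re.compile(r"[0-9]{1,4}(-[0-9]{1,4})?(,[0-9]{1,4}(-[0-9]{1,4})?)*")


def parse_vlan_ranges(text: str) -> list[tuple[int, int]]:
    """Parse a VLAN range list into (low, high) pairs.

    Raises ValueError if the string is malformed, a value is outside the
    assumed 1..4094 range, or the ranges overlap or are not ascending.
    """
    if VLAN_LIST.fullmatch(text) is None:
        raise ValueError("malformed VLAN range list")

    ranges = []
    previous_high = 0
    for item in text.split(","):
        low_text, _, high_text = item.partition("-")
        low = int(low_text)
        high = int(high_text) if high_text else low  # single value "N"
        if not 1 <= low <= high <= 4094:
            raise ValueError(f"bad range: {item}")
        if low <= previous_high:
            raise ValueError(f"overlapping or out of order: {item}")
        ranges.append((low, high))
        previous_high = high
    return ranges


if __name__ == "__main__":
    print(parse_vlan_ranges("1-10,20-400,600,2000-3000"))
```

The regex stays short enough to read at a glance, while the ordering and overlap rules, which need the numeric values, are handled by a few lines of ordinary code.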

So, I propose that we use regexes for checking that the string is structurally correct, but don't use regexes to perform numerical range checks of string encoded numbers, since it makes the regexes hard to read/verify, and doesn't improve the readability of the YANG file either.

Thanks,
Rob



Vladimir

Thanks,
Rob



