On 08/21/2017 05:14 PM, Robert Wilton wrote:

Hi Acee,

That makes sense.

The other thing that I think we have got wrong is how we model regex pattern statements. I think it would be much better if they were written to be less exhaustive and much simpler.

E.g. the "route distinguisher" pattern in draft-ietf-rtgwg-routing-types-09 is defined as this:

          pattern
            '(0:(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|'
          +     '6[0-4][0-9]{3}|'
          +     '[0-5]?[0-9]{0,3}[0-9]):(429496729[0-5]|'
          +     '42949672[0-8][0-9]|'
          +     '4294967[01][0-9]{2}|429496[0-6][0-9]{3}|'
          +     '42949[0-5][0-9]{4}|'
          +     '4294[0-8][0-9]{5}|429[0-3][0-9]{6}|'
          +     '42[0-8][0-9]{7}|4[01][0-9]{8}|'
          +     '[0-3]?[0-9]{0,8}[0-9]))|'
          + '(1:((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|'
          +     '25[0-5])\.){3}([0-9]|[1-9][0-9]|'
          +     '1[0-9]{2}|2[0-4][0-9]|25[0-5])):(6553[0-5]|'
          +     '655[0-2][0-9]|'
          +     '65[0-4][0-9]{2}|6[0-4][0-9]{3}|'
          +     '[0-5]?[0-9]{0,3}[0-9]))|'
          + '(2:(429496729[0-5]|42949672[0-8][0-9]|'
          +     '4294967[01][0-9]{2}|'
          +     '429496[0-6][0-9]{3}|42949[0-5][0-9]{4}|'
          +     '4294[0-8][0-9]{5}|'
          +     '429[0-3][0-9]{6}|42[0-8][0-9]{7}|4[01][0-9]{8}|'
          +     '[0-3]?[0-9]{0,8}[0-9]):'
          +     '(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|'
          +     '6[0-4][0-9]{3}|'
          +     '[0-5]?[0-9]{0,3}[0-9]))|'
          + '(6(:[a-fA-F0-9]{2}){6})|'
          + '(([3-57-9a-fA-F]|[1-9a-fA-F][0-9a-fA-F]{1,3}):'
          +     '[0-9a-fA-F]{1,12})';
        }

But I think that it would be much easier to read, and quite possibly more performant to execute, if the pattern regex was written something like the following:

 pattern
    '(0:[0-9]{1,5}:[0-9]{1,10})|'
  + '(1:([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5})|'
  + '(2:[0-9]{1,10}:[0-9]{1,5})|'
  + '(6(:[a-fA-F0-9]{2}){6})';

Of course, this would allow more invalid values, but most servers would be expected to reject those anyway when they convert them into an internal binary format.
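To make the trade-off concrete, here is a minimal Python sketch. It assumes that type 1 is a dotted quad followed by a 2-byte number and that type 2 is a 4-byte ASN followed by a 2-byte number, mirroring the structure of the original pattern. The names `LOOSE_RD`, `matches_loose` and `type0_to_binary` are hypothetical and only illustrate the kind of range check a server might apply during conversion; they are not taken from any implementation.

```python
import re
import struct

# Simplified pattern in the spirit of the proposal (types 0, 1, 2 and 6 only).
# Assumption: type 1 is a dotted quad, type 2 is "asn4:number2".
LOOSE_RD = re.compile(
    r"(0:[0-9]{1,5}:[0-9]{1,10})|"
    r"(1:([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5})|"
    r"(2:[0-9]{1,10}:[0-9]{1,5})|"
    r"(6(:[a-fA-F0-9]{2}){6})"
)


def matches_loose(value: str) -> bool:
    # YANG patterns are implicitly anchored, hence fullmatch.
    return LOOSE_RD.fullmatch(value) is not None


def type0_to_binary(value: str) -> bytes:
    """Hypothetical server-side conversion of a type 0 value to 8 bytes."""
    rd_type, asn, number = value.split(":")
    if rd_type != "0":
        raise ValueError("not a type 0 route distinguisher")
    asn_i, number_i = int(asn), int(number)
    # The range check the loose pattern leaves to the server.
    if asn_i > 0xFFFF or number_i > 0xFFFFFFFF:
        raise ValueError("field out of range")
    # 2-byte type, 2-byte ASN, 4-byte assigned number, network byte order.
    return struct.pack("!HHI", 0, asn_i, number_i)
```

With these definitions, `matches_loose("0:99999:1")` is true even though 99999 does not fit in two bytes, and `type0_to_binary("0:99999:1")` raises `ValueError`. The invalid value is caught, but only at conversion time rather than by the pattern.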

What do you, and others, think?

You still need the |(([3-57-9a-fA-F]|[1-9a-fA-F][0-9a-fA-F]{1,3}):[0-9a-fA-F]{1,12}) at the end to not reject valid values though.
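A small Python sketch of this point follows. The simplified alternatives are written in an assumed form (dotted quad for type 1, ASN plus number for type 2), the final alternative is copied from the original pattern, and the test value and the helper `accepts` are hypothetical.

```python
import re

# Simplified alternatives for types 0, 1, 2 and 6 (assumed forms).
LOOSE = (
    r"(0:[0-9]{1,5}:[0-9]{1,10})|"
    r"(1:([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5})|"
    r"(2:[0-9]{1,10}:[0-9]{1,5})|"
    r"(6(:[a-fA-F0-9]{2}){6})"
)
# Final alternative copied from the original pattern.
GENERIC = r"(([3-57-9a-fA-F]|[1-9a-fA-F][0-9a-fA-F]{1,3}):[0-9a-fA-F]{1,12})"


def accepts(pattern: str, value: str) -> bool:
    # YANG patterns are implicitly anchored, hence fullmatch.
    return re.fullmatch(pattern, value) is not None


value = "3:0123456789ab"  # hypothetical value of the generic type:hex form
print(accepts(LOOSE, value))                  # → False
print(accepts(LOOSE + "|" + GENERIC, value))  # → True
```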

IMO a pattern statement has value if it absolutely defines the set of valid strings. In general I do not see the benefit of pattern statements that do not reject all invalid string instances. I prefer the original pattern or none at all.
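As an illustration of what "rejecting all invalid instances" means in practice, this Python sketch checks the 2-byte alternative of the original pattern against every decimal string from 0 to 99999. The names `UINT16` and `accepts_uint16` are hypothetical; only the regex text is copied from the draft.

```python
import re

# 2-byte alternative copied from the original pattern.
UINT16 = re.compile(
    r"6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|"
    r"6[0-4][0-9]{3}|"
    r"[0-5]?[0-9]{0,3}[0-9]"
)


def accepts_uint16(text: str) -> bool:
    # YANG patterns are implicitly anchored, hence fullmatch.
    return UINT16.fullmatch(text) is not None


# Values for which the sub-pattern disagrees with the range 0..65535.
mismatches = [n for n in range(100000) if accepts_uint16(str(n)) != (n <= 65535)]
print(mismatches)  # → []
```

For these inputs the exhaustive sub-pattern and the numeric range agree, which a looser form such as `[0-9]{1,5}` cannot offer.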

Vladimir

Thanks,
Rob

_______________________________________________
netmod mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/netmod
