Hi,

On 12/10/2017 18:15, Acee Lindem (acee) wrote:
Hi Adam,

On 10/12/17, 12:12 PM, "Adam Roach" <[email protected]> wrote:

On 10/11/17 20:16, Acee Lindem (acee) wrote:
____

There are several patterns in the YANG definition that perform
significant
restriction of numbers (e.g., to ensure they don't fall outside the
range
that
can be stored in 16 or 32 bits). In many cases, these patterns include
the
ability to zero-prefix some (but not all) decimal values. For example,
the
production for route-origin would allow leading zeros in "2:0100:0555"
but not
in "2:04294967295:065535" (even though "2:4294967295:65535" is okay). I
don't
know offhand whether it makes sense to allow leading zeros in these
fields, but
I would argue that the production should be consistent in allowing or
disallowing them. This issue arises in various forms in route-target,
ipv6-route-target, route-origin, and ipv6-route-origin.
We’ll look at this and get back to you - a lot of time has already gone
into formulating and testing these patterns.

Yes, and it would be a shame if that work resulted in publishing
patterns with known issues.

This flaw arises in three formulations (each of which appear multiple
times), and would be quite easy to fix. These fixes should be obvious by
inspection.

32 bits (0-4,294,967,295)
   Replace: [0-3]?[0-9]{0,8}[0-9]
   With:    [1-3][0-9]{9}|[1-9][0-9]{0,8}|0

16 bits (0-65535)
   Replace: [0-5]?[0-9]{0,3}[0-9]
   With:    [1-5][0-9]{4}|[1-9][0-9]{0,3}|0

8 bits (0-255)
   Replace: [01]?[0-9]?[0-9]
   With:    1[0-9]{2}|[1-9]?[0-9]
Yes - this doesn’t appear to be a complicate fix at all. We’re going to
get more eyes on it and do some tests with https://yangcatalog.org/yangre/
  but we should be able to fix this.
For what its worth, and I'm not proposing this is changed now given that this has been discussed previously, but I do think that this is another example of the dangers inherent with specifying precise, but more complex, regular expressions in YANG for checking numerical values.

E.g. I still prefer "[0-9]{0,10}" instead of "429496729[0-5]|42949672[0-8][0-9]|4294967[01][0-9]{2}|429496[0-6][0-9]{3}|42949[0-5][0-9]{4}|4294[0-8][0-9]{5}|429[0-3][0-9]{6}|42[0-8][0-9]{7}|4[01][0-9]{8}|[1-3][0-9]{9}|[1-9][0-9]{0,8}|0".

Yes, it allows some invalid numerical values, which would presumably fail when the value is converted to a internal numerical format, but it is easier to read/review, and is harder to get wrong.  Besides even if a stricter regex validates that it the value is syntactically correct it still doesn't mean that the correct intended value has been used ...



____

As an aside: replacing "[0-9]" with "\d" everywhere would make these
patterns easier to read in general, but this is merely a readability
improvement rather than a bug fix. Compare:

          + '(2:(429496729[0-5]|42949672[0-8][0-9]|'
          +     '4294967[01][0-9]{2}|'
          +     '429496[0-6][0-9]{3}|42949[0-5][0-9]{4}|'
          +     '4294[0-8][0-9]{5}|'
          + '429[0-3][0-9]{6}|42[0-8][0-9]{7}|4[01][0-9]{8}|'
          +     '[1-3][0-9]{9}|[1-9][0-9]{0,8}|0):'
          +     '(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|'
          +     '6[0-4][0-9]{3}|'
          +     '[1-5][0-9]{4}|[1-9][0-9]{0,3}|0))|'

Becomes:

          + '(2:(429496729[0-5]|42949672[0-8]\d|'
          +     '4294967[01]\d{2}|'
          +     '429496[0-6]\d{3}|42949[0-5]\d{4}|'
          +     '4294[0-8]\d{5}|'
          +     '429[0-3]\d{6}|42[0-8]\d{7}|4[01]\d{8}|'
          +     '[1-3]\d{9}|[1-9]\d{0,8}|0):'
          +     '(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|'
          +     '6[0-4]\d{3}|'
          +     '[1-5]\d{4}|[1-9]\d{0,3}|0))|'

Although this is a somewhat controversial subject, we used “[0-9]" for
portability for implementations using non-standard regular expression
parsers.
The XML regex language that is reused by YANG is built for Unicode support.  Hence '\d' matches all digit characters in all scripts, i.e. it actually matches 500+ characters rather than just the ASCII digits 0-9, and in most network configuration models that is probably not what is wanted.  I'm not sure whether it matters that \d matches a much larger class of characters, but at least with [0-9] it is entirely unambiguous.

Thanks,
Rob

Thanks,
Acee


/a
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg

_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg

Reply via email to