We did some further analysis of the regexp.  We found:

1. Replacing \w+(?:\-*\w+)+ with \w[-\w]+ cuts the match time from
5 hours to a few ms.  The time still varies with the length of the
string, but it maxed out at 100ms for hrSWInstalledLastUpdateTime.

2. This regexp is different from the one in the perl module.  Most
significantly, it captures multiple dot-separated words, requires a
period, and must end with only digits; the perl version captures only
a single word, doesn't require dots, and captures anything at the end.
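
As a rough illustration of point 1, here is a sketch comparing what the two sub-patterns accept.  Using re.fullmatch and the sample list (including the made-up hyphenated strings) is my own choice for the example, not something taken from the module:

```python
import re

OLD = re.compile(r'\w+(?:\-*\w+)+')  # nested quantifiers
NEW = re.compile(r'\w[-\w]+')        # single character class

# Tag names from this thread plus a few hypothetical edge cases.
samples = ['ifName', 'hrSWInstalledLastUpdateTime',
           'a-b', 'a--b', 'x', 'if-', '-if']

def accepts(pattern, text):
    """True if the pattern matches the whole string."""
    return pattern.fullmatch(text) is not None

for s in samples:
    print('%-30s old=%-5s new=%s' % (s, accepts(OLD, s), accepts(NEW, s)))
```

The two agree on everything here except a trailing hyphen ('if-'), which only the character-class version accepts.  I would not expect that to matter for MIB names, but it is worth knowing that the replacement is not strictly identical.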

If we switch to the one used in the perl module:

perl: ($vars =~ /^((?:\.\d+)+|(?:\w+(?:\-*\w+)+))\.?(.*)$/);
python: tagiidRe = re.compile(r'^((?:\.\d+)+|(?:\w+(?:\-*\w+)+))\.?(.*?)$')

Regex match against tag ifName took 0.000021 seconds
('ifName', '')
Regex match against tag icmpInMsgs took 0.000007 seconds
('icmpInMsgs', '')
Regex match against tag ifHCOutBroadcastPkts took 0.000006 seconds
('ifHCOutBroadcastPkts', '')
Regex match against tag hrSWInstalledLastUpdateTime took 0.000006 seconds
('hrSWInstalledLastUpdateTime', '')
Regex match against tag ifName.6 took 0.000006 seconds
('ifName', '6')
Regex match against tag icmpInMsgs.6 took 0.000005 seconds
('icmpInMsgs', '6')
Regex match against tag ifHCOutBroadcastPkts.6 took 0.000006 seconds
('ifHCOutBroadcastPkts', '6')
Regex match against tag hrSWInstalledLastUpdateTime.6 took 0.000006 seconds
('hrSWInstalledLastUpdateTime', '6')
Regex match against tag ifName.1.2.3.4 took 0.000006 seconds
('ifName', '1.2.3.4')
Regex match against tag icmpInMsgs.1.2.3.4 took 0.000005 seconds
('icmpInMsgs', '1.2.3.4')
Regex match against tag ifHCOutBroadcastPkts.1.2.3.4 took 0.000006 seconds
('ifHCOutBroadcastPkts', '1.2.3.4')
Regex match against tag hrSWInstalledLastUpdateTime.1.2.3.4 took 0.000006 seconds
('hrSWInstalledLastUpdateTime', '1.2.3.4')
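
For anyone who wants to reproduce these numbers, something along these lines should do it.  This is a minimal sketch, not the exact script used: the timed_match helper and the use of time.perf_counter are assumptions for the example, and the timings will differ from machine to machine.

```python
import re
import time

# The python regexp quoted above.
tagiidRe = re.compile(r'^((?:\.\d+)+|(?:\w+(?:\-*\w+)+))\.?(.*?)$')

def timed_match(tag):
    """Return (elapsed seconds, captured groups or None) for one tag."""
    start = time.perf_counter()
    m = tagiidRe.match(tag)
    elapsed = time.perf_counter() - start
    return elapsed, (m.groups() if m else None)

names = ['ifName', 'icmpInMsgs', 'ifHCOutBroadcastPkts',
         'hrSWInstalledLastUpdateTime']

for suffix in ('', '.6', '.1.2.3.4'):
    for name in names:
        tag = name + suffix
        elapsed, groups = timed_match(tag)
        print('Regex match against tag %s took %f seconds' % (tag, elapsed))
        print(groups)
```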

I propose using this version, since:

a) it matches the same strings as the perl version, which is
presumably heavily used.  That removes the need to answer "what the
heck is this supposed to match?", since the answer is "the same as the
perl version"
b) it works efficiently and correctly on the test tags above


I would actually advocate changing both versions to use (?:\w[-\w]+)
as the second clause, since that's unambiguous.  With an ambiguous
regexp there are many ways to split the same string between the
sub-patterns, and on a failed match the engine may have to try every
one of them before giving up.  But that's a separate issue.
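
To make the ambiguity cost concrete, here is a small sketch that forces both sub-patterns to fail.  The ^...$ anchors and the 'aaa...!' test strings are assumptions I added to guarantee a failed match; they are not part of either module's regexp.

```python
import re
import time

# Anchored so that the trailing '!' makes every attempt fail.
AMBIGUOUS = re.compile(r'^\w+(?:\-*\w+)+$')
UNAMBIGUOUS = re.compile(r'^\w[-\w]+$')

def time_failure(pattern, text):
    """Return (elapsed seconds, match result) for one attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return time.perf_counter() - start, result

for n in (10, 14, 18):
    text = 'a' * n + '!'
    slow, r1 = time_failure(AMBIGUOUS, text)
    fast, r2 = time_failure(UNAMBIGUOUS, text)
    print('n=%d ambiguous=%.6fs unambiguous=%.6fs' % (n, slow, fast))
```

The ambiguous form should get dramatically slower with each step in n, while the character-class form stays flat.  As far as I can tell, this is also why the perl-style regexp above is fast even though it keeps the nested quantifier: its tail \.?(.*)$ accepts anything, so the first attempt succeeds and the engine never has to backtrack through the alternatives.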

  Bill

_______________________________________________
Net-snmp-coders mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders