On Sun, 2008-12-21 at 02:57 +0800, Cedric Vivier wrote:
> Hey gendarmers,
> 
> This new rule in the Correctness category emits a defect when a
> regular expression string provided to methods requiring those is an
> invalid pattern (and would consequently throw an exception at
> run-time).

Nice one :-)

> Bad example:
> Regex re = new Regex ("^\\"); //Invalid end of pattern
> 
> Good example:
> Regex re = new Regex (@"^\\");
> 
> Bad example 2:
> Regex.IsMatch (input, "^([a-z)*"); //Unterminated [] set
> 
> Good example 2:
> Regex.IsMatch (input, "^([a-z])*");
> 
> 
> For example a defect like example 2 looks like :
> 
> 3. ProvideCorrectRegexPatternRule
> 
> Problem: An invalid regular expression pattern is provided to a 
> method/constructor requiring a valid pattern.
> * Severity: High, Confidence: High
> * Target:   System.Void RegexModule::.cctor()
> * Source:   /home/cedric/tmp/regex.boo(≈1)
> * Details:  Pattern '[a-z' is invalid. Reason: parsing "[a-z" - Unterminated 
> [] set.
> 
> Solution: Fix the invalid regular expression.
> More info available at: 
> http://www.mono-project.com/Gendarme.Rules.Correctness#ProvideCorrectRegexPatternRule
> 
> 
> Please review,

...

> +
> +namespace Gendarme.Rules.Correctness {
> +
> +       /// <summary>
> +       /// This rule checks if methods/constructors requiring a regular 
> expression
> +       /// string are provided a valid pattern.
> +       /// </summary>
> +       /// <example>
> +       /// Bad example:
> +       /// <code>
> +       /// Regex re = new Regex ("^\\"); //Invalid end of pattern
> +       /// </code>
> +       /// </example>
> +       /// <example>
> +       /// Good example:
> +       /// <code>
> +       /// Regex re = new Regex (@"^\\");
> +       /// </code>
> +       /// </example>
> +       /// <example>
> +       /// Bad example:
> +       /// <code>
> +       /// Regex re = new Regex ("([a-z)*"); //Unterminated [] set
> +       /// </code>
> +       /// </example>
> +       /// <example>
> +       /// Good example:
> +       /// <code>
> +       /// Regex re = new Regex ("([a-z])*");
> +       /// </code>
> +       /// </example>
> +       /// <example>
> +       /// Bad example:
> +       /// <code>
> +       /// Regex.IsMatch (code, @"(\w)-\2"); //Reference to undefined group 
> number 2
> +       /// </code>
> +       /// </example>
> +       /// <example>
> +       /// Good example:
> +       /// <code>
> +       /// Regex.IsMatch (code, @"(\w)-\1");
> +       /// </code>
> +       /// </example>
> +
> +       [Problem ("An invalid regular expression pattern is provided to a 
> method/constructor requiring a valid pattern.")]

I think we can skip the "requiring a valid pattern".

> +       [Solution ("Fix the invalid regular expression.")]
> +       [EngineDependency (typeof (OpCodeEngine))]
> +       public class ProvideCorrectRegexPatternRule : Rule, IMethodRule {
> +
> +               static OpCodeBitmask callsAndNewobjBitmask = 
> BuildCallsAndNewobjOpCodeBitmask ();
> +
> +               MethodDefinition method;
> +
> +               const string RegexClass = 
> "System.Text.RegularExpressions.Regex";
> +               const string ValidatorClass = 
> "System.Configuration.RegexStringValidator";
> +
> +               public override void Initialize (IRunner runner)
> +               {
> +                       base.Initialize (runner);
> +
> +                       Runner.AnalyzeModule += delegate (object o, 
> RunnerEventArgs e) {
> +                               bool usingRegexClass = 
> e.CurrentAssembly.Name.Name == "System"
> +                                                      || 
> e.CurrentModule.TypeReferences.ContainsType (RegexClass);
> +                               bool usingValidatorClass = 
> e.CurrentAssembly.Runtime >= TargetRuntime.NET_2_0
> +                                                          && 
> (e.CurrentAssembly.Name.Name == "System.Configuration"
> +                                                             || 
> e.CurrentModule.TypeReferences.ContainsType (ValidatorClass));
> +                               Active = usingRegexClass | 
> usingValidatorClass;
> +                       };
> +               }
> +
> +               void CheckPattern (Instruction ins, int argumentIndex)
> +               {
> +                       Instruction ld = ins;
> +                       for (int i = 0; i < argumentIndex; ++i)
> +                               ld = ld.Previous;

I don't think this will work if expression are used for parameters. E.g.

        new RegEx (i == 0 ? "bad" : "good");

Using the TraceBack rock (overload with an offset) should work.

> +                       switch (ld.OpCode.Code) {
> +                       case Code.Ldstr:
> +                               CheckPattern (ins, (string) ld.Operand);
> +                               break;
> +                       case Code.Ldsfld:
> +                               {

No need for '{ }' here, but if one is used it should start on the
previous line.

> +                                       FieldReference f = (FieldReference) 
> ld.Operand;
> +                                       if (f.Name != "Empty" || 
> f.DeclaringType.FullName != "System.String")
> +                                               return;
> +                               }
> +                               CheckPattern (ins, null);
> +                               break;
> +                       case Code.Ldnull:
> +                               CheckPattern (ins, null);
> +                               break;
> +                       }
> +               }
> +
> +               void CheckPattern (Instruction ins, string pattern)
> +               {
> +                       if (string.IsNullOrEmpty (pattern)) {
> +                               Runner.Report (method, ins, Severity.High, 
> Confidence.High, "Pattern is null or empty.");
> +                               return;
> +                       }
> +
> +                       try {
> +                               new Regex (pattern);
> +                       } catch (Exception e) {

Is it possible to catch something more specific than Exception ? this
should have triggered something in "make self-test".

> +                               string msg = string.Format ("Pattern '{0}' is 
> invalid. Reason: {1}", pattern, e.Message);
> +                               Runner.Report (method, ins, Severity.High, 
> Confidence.High, msg);
> +                       }
> +               }
> +
> +               void CheckCall (Instruction ins, MethodDefinition call)
> +               {
> +                       //check only constructors and static non-property 
> methods
> +                       if (call.Name != MethodDefinition.Ctor && 
> (call.HasThis || call.IsProperty ()))
> +                               return;
> +                       if (!call.HasParameters)
> +                               return;
> +                       if (call.DeclaringType.FullName != RegexClass && 
> call.DeclaringType.FullName != ValidatorClass)
> +                               return;
> +
> +                       for (int i = 0; i < call.Parameters.Count; ++i) {
> +                               if ((call.Parameters [i].Name == "pattern" || 
> call.Parameters [i].Name == "regex")
> +                                   && call.Parameters 
> [i].ParameterType.FullName == "System.String") {

That's why I like/prefer foreach (even if it does allocate some
memory ;-) because it directly gives you an object to use. Otherwise you
often end up getting this same object over and over again (3 times
here*) which hurts readability and can also affect performance (no much
in general, a bit more the property is virtual, again a bit more when
indexers are used).

        * there's a rule request for this in bugzilla

Another way to avoid "call.Parameters [i]" three times (and keeping the
for-loop) is to assign it once to a variable.

> +                                       CheckPattern (ins, 
> call.Parameters.Count - i);
> +                                       return;
> +                               }
> +                       }
> +               }
> +
> +               public RuleResult CheckMethod (MethodDefinition method)
> +               {
> +                       if (!method.HasBody)
> +                               return RuleResult.DoesNotApply;
> +
> +                       this.method = method;
> +
> +                       //is there any interesting opcode in the method?
> +                       if (!callsAndNewobjBitmask.Intersect 
> (OpCodeEngine.GetBitmask (method)))
> +                               return RuleResult.DoesNotApply;
> +
> +                       foreach (Instruction ins in method.Body.Instructions) 
> {
> +                               if (!callsAndNewobjBitmask.Get 
> (ins.OpCode.Code))
> +                                       continue;
> +
> +                               MethodReference call = (MethodReference) 
> ins.Operand;
> +                               CheckCall (ins, call.Resolve ());

Resolve can have big side effects (like loading another assembly) so
it's best to avoid it whenever possible or, more likely, delay it. E.g.
make all the checks possible on the MethodReference then Resolve and
make the last checks on the MethodDefinition.

> +                       }
> +
> +                       return Runner.CurrentRuleResult;
> +               }
> +
> +               static OpCodeBitmask BuildCallsAndNewobjOpCodeBitmask ()
> +               {
> +                       #if true
> +                               return new OpCodeBitmask (0x8000000000, 
> 0x4400000000000, 0x0, 0x0);
> +                       #else
> +                               OpCodeBitmask mask = new OpCodeBitmask ();
> +                               mask.UnionWith (OpCodeBitmask.Calls);
> +                               mask.Set (Code.Newobj);
> +                               return mask;
> +                       #endif
> +               }
> +       }
> +}
> diff --git 
> a/gendarme/rules/Gendarme.Rules.Correctness/Test/ProvideCorrectRegexPatternTest.cs
>  
> b/gendarme/rules/Gendarme.Rules.Correctness/Test/ProvideCorrectRegexPatternTest.cs
> new file mode 100644
> index 0000000..4942925
> --- /dev/null
> +++ 
> b/gendarme/rules/Gendarme.Rules.Correctness/Test/ProvideCorrectRegexPatternTest.cs

Tests looks great, but you need an extra one where expressions are used
as parameters.

> diff --git a/gendarme/rules/common.make b/gendarme/rules/common.make
> index b8ba71d..0be1059 100644
> --- a/gendarme/rules/common.make
> +++ b/gendarme/rules/common.make
> @@ -7,7 +7,7 @@ EXTRA_TESTS_OPTIONS := $(TESTS_OPTIONS)
>  
>  console_runner=../../bin/gendarme.exe
>  framework=../../bin/Gendarme.Framework.dll
> -common_tests=../Test.Rules/Test.Rules.dll
> +common_tests=../Test.Rules/Test.Rules.dll,System.Configuration

That should be added only to the Correctness test assembly (not every
one of them, unless the majority requires it).
 
>  prefixed_rules_category = $(shell expr "$(PWD)" : '.*\(Gendarme.Rules.*\)')
>  rules_category = $(shell echo $(prefixed_rules_category) | cut -c 16-)
> -- 
> 1.6.0.3

Thanks,
Sebastien


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Gendarme" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/gendarme?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to