I quite like these suggestions, especially number 1. A few notes:

> Proposed output for messages:

> +-----------+------------+-------------+
> |message id |occurrences |reference    |
> +===========+============+=============+
> +-----------+------------+-------------+
> |E0602      |2           |PEP 333      |
> +-----------+------------+-------------+
> |W0612      |1           |http://      |
> +-----------+------------+-------------+
> |W0301      |1           |file:///....-+
> +-----------+------------+-------------+
> |F0401      |1           |(field empty)|
> +-----------+------------+-------------+
>
> where the links might be to the pylint error code wiki, python peps,
> pylint documentation, or local documentation.

Would this be for HTML output mode? I like the idea, but I can't
imagine someone wanting to transcribe the URLs into their browser from
the text output.

> These regular expressions describe the naming conventions of a particular
> project. There does not seem to be an official style for Python: even the
> standard library uses camelCase in some modules and underscore_as_separator
> in other modules.

I think PEP 8 (http://www.python.org/dev/peps/pep-0008/) is pretty
close to an 'official' set of style conventions, and it is what Pylint
bases its default regular expressions on. It does come with the caveat
that a project's own pre-existing style conventions take precedence
over it, which is why, as you said, even some standard library modules
use naming conventions that contradict it.

> It may be possible to generate a useful English description of the problem
> by looking at the expression and how it fails to match the name. For
> example, whether it encounters an invalid character (if so, indicate which
> one), whether it has an issue with the length of the name etc. Maybe a nice
> challenge for a student?

A simpler implementation might just be to associate a string with each
regex that gets printed with the accompanying message (e.g. "Function
names should be in pothole_case", "Class names should be in
CapWords"). If someone wants to define their own regexes, they could
also optionally include a string briefly describing the rule or
rationale behind each.
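
As a rough sketch of what I mean (the table, the function name and the
class-name regex below are made up for illustration and are not
Pylint's actual internals; the function regex is the one from the
messages quoted in this thread):

```python
import re

# Hypothetical table: each kind of name pairs a regex with a short
# plain-English description of the rule. A project overriding the regex
# could optionally supply its own description alongside it.
NAMING_RULES = {
    "function": (re.compile(r"[a-z_][a-z0-9_]{2,30}$"),
                 "Function names should be in pothole_case"),
    # Assumed class-name pattern, for illustration only.
    "class": (re.compile(r"[A-Z_][a-zA-Z0-9]+$"),
              "Class names should be in CapWords"),
}


def check_name(kind, name):
    """Return None if the name is fine, else a message with the description."""
    pattern, description = NAMING_RULES[kind]
    if pattern.match(name):
        return None
    return 'Invalid name "%s" (%s; should match %s)' % (
        name, description, pattern.pattern)


print(check_name("function", "addWord"))
```

With this, the message for "addWord" would carry the description in
front of the regex, so the user sees the rule before the pattern.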

> It could be a Wiki where experienced users can add the motivation behind the
> checks pylint does. Adding text to a Wiki has a lower barrier of entry than
> submitting a patch.

Conveniently enough, such a thing exists already!
http://pylint-messages.wikidot.com/

Cheers,
Colin

On Fri, Mar 12, 2010 at 8:58 PM, Maarten ter Huurne
<maar...@treewalker.org> wrote:
>
> On Friday 12 March 2010, Sarah Strong wrote:
>
> > Proposed output:
> >
> > ************* Module Kontroller
> > W:  9: Bad indentation. Found 2 spaces, expected 4
> > W: 10: Bad indentation. Found 2 spaces, expected 4
> > W: 11: Bad indentation. Found 2 spaces, expected 4
> > W: 12: Bad indentation. Found 2 spaces, expected 4
> > [4 more Bad indentation messages, use --unabridged to display them all]
>
> In addition to avoiding discouragement of new users, it makes the more
> serious problems stand out more because they are not lost in a sea of
> repeated warnings. If pylint finds a bug on the first run it makes a good
> first impression on the user.
>
> > Errors such as
> > C:111:Kontroller.addWord: Invalid name "addWord" (should match
> > [a-z_][a-z0-9_]{2,30}$)
> > C: 89:Kontroller.checkUserId: Invalid name "checkUserId" (should match
> > [a-z_][a-z0-9_]{2,30}$)
> >
> > are a bit confusing because the user may be unsure of why such a pattern
> > match is necessary.
>
> These regular expressions describe the naming conventions of a particular
> project. There does not seem to be an official style for Python: even the
> standard library uses camelCase in some modules and underscore_as_separator
> in other modules.
>
> So it is expected that a project will customize these expressions to match
> the conventions of that specific project. Regular expressions are a very
> good way of allowing that kind of customization. Unfortunately, not everyone
> is experienced in reading them.
>
> It may be possible to generate a useful English description of the problem
> by looking at the expression and how it fails to match the name. For
> example, whether it encounters an invalid character (if so, indicate which
> one), whether it has an issue with the length of the name etc. Maybe a nice
> challenge for a student?
>
> > Proposed output for messages:
> >
> >
> > +-----------+------------+-------------+
> > |message id |occurrences |reference    |
> > +===========+============+=============+
> > +-----------+------------+-------------+
> > |E0602      |2           |PEP 333      |
> > +-----------+------------+-------------+
> > |W0612      |1           |http://      |
> > +-----------+------------+-------------+
> > |W0301      |1           |file:///....-+
> > +-----------+------------+-------------+
> > |F0401      |1           |(field empty)|
> > +-----------+------------+-------------+
> >
> > where the links might be to the pylint error code wiki, python peps,
> > pylint documentation, or local documentation.
>
> PMD, a static code checker for Java, has an explanation of every built-in
> rule on its web site:
>  http://pmd.sourceforge.net/rules/optimizations.html#AddEmptyString
> This is really useful if you are wondering what the value is of obeying a
> certain rule.
>
> Maybe it could be done with systematic URLs: (example; URL does not exist)
>  http://www.logilab.org/project/pylint/rules/E0602
> It could be a Wiki where experienced users can add the motivation behind the
> checks pylint does. Adding text to a Wiki has a lower barrier of entry than
> submitting a patch.
>
> > * Possible output improvement #3*
> >
> > Modify pylint's rating system not to give negative ratings out of ten.
> > This doesn't match up to most people's expectations of how a rating
> > works.
>
> The rating system is part of the configuration. The default is:
>
> evaluation=10.0 - ((float(5 * error + warning + refactor + convention) /
> statement) * 10)
>
> So it computes a penalty and then subtracts that from 10.0. Since there is
> no limit to the amount of issues found, it can become negative no matter
> whether you set the maximum to 10 or 10000.
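
To make the arithmetic concrete, here is the default formula above
written as a plain Python function. This is only a sketch:
default_rating is my own name for it, not something Pylint defines,
and the counts in the examples are invented.

```python
def default_rating(error, warning, refactor, convention, statement):
    """Evaluate the default rating expression quoted above."""
    # Errors are weighted five times as heavily as the other categories.
    penalty = (float(5 * error + warning + refactor + convention)
               / statement) * 10
    # The penalty has no upper bound, so the result can drop below zero.
    return 10.0 - penalty


# 100 statements with a handful of issues: a positive rating.
print(default_rating(2, 10, 0, 5, 100))
# 10 statements with many issues: the rating goes negative.
print(default_rating(4, 10, 0, 0, 10))
```

The first call gives 7.5 and the second gives -20.0, which shows how a
small, messy module ends up well below zero.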
>
> Clipping the rating at 0 is not a good solution, since users would like to
> see their rating improve when they fix issues. If the rating improves from
> -6000 to -5000 but both are clipped to 0, there is no visible progress,
> which is discouraging.
>
> You could change the weight of one issue (the "* 10" in the formula) so that
> any code that is not deliberately designed to trigger as many warnings as
> possible would get a rating above 0. But then it would give overly high
> marks for code that is neither very poor nor great.
>
> A non-linear rating is probably the best solution. This also fits the
> typical progression in the number of issues found: initially there will be
> many violations because of consistent errors and/or lack of customization of
> the configuration to a project's conventions (coding style). Fixing those is
> relatively easy, so in the first phase the number of violations drops very
> quickly. It would make sense that the rating would increase a bit because of
> this, but not as dramatically as it does now. Conversely, the last issues to
> be fixed are probably the hardest, so fixing just a handful of those should
> already have a noticeable effect on the rating.
>
> I do think the rating has some value: if you compute it over a long time,
> you can get an impression whether the quality of your code base is improving
> or deteriorating over time. Currently the absolute score does not have much
> meaning though. It would be useful if someone could attempt to tune it.
>
> Bye,
>                Maarten
> _______________________________________________
> Python-Projects mailing list
> Python-Projects@lists.logilab.org
> http://lists.logilab.org/mailman/listinfo/python-projects