I quite like these suggestions, especially number 1. A few notes:

> Proposed output for messages:
>
> +-----------+------------+-------------+
> |message id |occurrences |reference    |
> +===========+============+=============+
> |E0602      |2           |PEP 333      |
> +-----------+------------+-------------+
> |W0612      |1           |http://      |
> +-----------+------------+-------------+
> |W0301      |1           |file:///.... |
> +-----------+------------+-------------+
> |F0401      |1           |(field empty)|
> +-----------+------------+-------------+
>
> where the links might be to the pylint error code wiki, python peps,
> pylint documentation, or local documentation.

Would this be for HTML output mode? I like the idea, but I can't imagine
someone wanting to transcribe the URLs into their browser from the text
output.

> These regular expressions describe the naming conventions of a particular
> project. There does not seem to be an official style for Python: even the
> standard library uses camelCase in some modules and underscore_as_separator
> in other modules.

I think PEP 8 (http://www.python.org/dev/peps/pep-0008/) is pretty close to
an 'official' set of style conventions, and it is what Pylint bases its
default regular expressions on. It does come with the caveat that it
shouldn't supersede a project's own pre-existing style conventions, which
is why, as you said, even some standard library modules use naming
conventions that contradict it.

> It may be possible to generate a useful English description of the problem
> by looking at the expression and how it fails to match the name. For
> example, whether it encounters an invalid character (if so, indicate which
> one), whether it has an issue with the length of the name etc. Maybe a nice
> challenge for a student?

A simpler implementation might just be to associate a string with each regex
that gets printed with the accompanying message (e.g. "Function names should
be in pothole_case", "Class names should be in CapWords").
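As a rough sketch of the string-per-regex idea (not how Pylint actually
stores its naming rules): the function pattern below is the one quoted in
the messages in this thread, while the class pattern, the NAMING_RULES table
and the check_name helper are hypothetical names made up for illustration.

```python
import re

# Hypothetical table: each kind of name maps to (regex, human-readable hint).
# The function regex is the one quoted in this thread; the class regex and
# both hint strings are assumptions for the sake of the example.
NAMING_RULES = {
    "function": (re.compile(r"[a-z_][a-z0-9_]{2,30}$"),
                 "Function names should be in pothole_case"),
    "class": (re.compile(r"[A-Z_][a-zA-Z0-9]+$"),
              "Class names should be in CapWords"),
}


def check_name(kind, name):
    """Return None if the name matches its rule, else a message with the hint."""
    regex, hint = NAMING_RULES[kind]
    if regex.match(name):
        return None
    # Show the plain-English hint first, then the regex for those who want it.
    return 'Invalid name "%s" (%s; should match %s)' % (name, hint, regex.pattern)


if __name__ == "__main__":
    print(check_name("function", "addWord"))   # a message containing the hint
    print(check_name("function", "add_word"))  # None, the name is fine
```

Keeping the hint next to the regex means a project that overrides the
pattern can override the explanation in the same place.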
If someone wants to define their own regexes, they could also optionally
include a string briefly describing the rule or rationale behind each.

> It could be a Wiki where experienced users can add the motivation behind the
> checks pylint does. Adding text to a Wiki has a lower barrier of entry than
> submitting a patch.

Conveniently enough, such a thing exists already!
http://pylint-messages.wikidot.com/

Cheers,
Colin

On Fri, Mar 12, 2010 at 8:58 PM, Maarten ter Huurne <maar...@treewalker.org> wrote:
>
> On Friday 12 March 2010, Sarah Strong wrote:
>
> > Proposed output:
> >
> > ************* Module Kontroller
> > W: 9: Bad indentation. Found 2 spaces, expected 4
> > W: 10: Bad indentation. Found 2 spaces, expected 4
> > W: 11: Bad indentation. Found 2 spaces, expected 4
> > W: 12: Bad indentation. Found 2 spaces, expected 4
> > [4 more Bad indentation messages, use --unabridged to display them all]
>
> In addition to avoiding discouragement of new users, it makes the more
> serious problems stand out more because they are not lost in a sea of
> repeated warnings. If pylint finds a bug on the first run it makes a good
> first impression on the user.
>
> > Errors such as
> > C:111:Kontroller.addWord: Invalid name "addWord" (should match
> > [a-z_][a-z0-9_]{2,30}$)
> > C: 89:Kontroller.checkUserId: Invalid name "checkUserId" (should match
> > [a-z_][a-z0-9_]{2,30}$)
> >
> > are a bit confusing because the user may be unsure of why such a pattern
> > match is necessary.
>
> These regular expressions describe the naming conventions of a particular
> project. There does not seem to be an official style for Python: even the
> standard library uses camelCase in some modules and underscore_as_separator
> in other modules.
>
> So it is expected that a project will customize these expressions to match
> the conventions of that specific project. Regular expressions are a very
> good way of allowing that kind of customization.
> Unfortunately, not everyone is experienced in reading them.
>
> It may be possible to generate a useful English description of the problem
> by looking at the expression and how it fails to match the name. For
> example, whether it encounters an invalid character (if so, indicate which
> one), whether it has an issue with the length of the name etc. Maybe a nice
> challenge for a student?
>
> > Proposed output for messages:
> >
> > +-----------+------------+-------------+
> > |message id |occurrences |reference    |
> > +===========+============+=============+
> > |E0602      |2           |PEP 333      |
> > +-----------+------------+-------------+
> > |W0612      |1           |http://      |
> > +-----------+------------+-------------+
> > |W0301      |1           |file:///.... |
> > +-----------+------------+-------------+
> > |F0401      |1           |(field empty)|
> > +-----------+------------+-------------+
> >
> > where the links might be to the pylint error code wiki, python peps,
> > pylint documentation, or local documentation.
>
> PMD, a static code checker for Java, has an explanation of every built-in
> rule on its web site:
> http://pmd.sourceforge.net/rules/optimizations.html#AddEmptyString
> This is really useful if you are wondering what the value is of obeying a
> certain rule.
>
> Maybe it could be done with systematic URLs: (example; URL does not exist)
> http://www.logilab.org/project/pylint/rules/E0602
> It could be a Wiki where experienced users can add the motivation behind the
> checks pylint does. Adding text to a Wiki has a lower barrier of entry than
> submitting a patch.
>
> > *Possible output improvement #3*
> >
> > Modify pylint's rating system not to give negative ratings out of ten.
> > This doesn't match up to most people's expectations of how a rating
> > works.
>
> The rating system is part of the configuration.
> The default is:
>
> evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)
>
> So it computes a penalty and then subtracts that from 10.0. Since there is
> no limit to the number of issues found, it can become negative no matter
> whether you set the maximum to 10 or 10000.
>
> Clipping the rating at 0 is not a good solution, since users would like to
> see their rating improve when they fix issues. If the rating improves from
> -6000 to -5000 but both are clipped to 0, there is no visible progress,
> which is discouraging.
>
> You could change the weight of one issue (the "* 10" in the formula) so that
> any code that is not deliberately designed to trigger as many warnings as
> possible would get a rating above 0. But then it would give overly high
> marks for code that is neither very poor nor great.
>
> A non-linear rating is probably the best solution. This also fits the
> typical progression in the number of issues found: initially there will be
> many violations because of consistent errors and/or lack of customization of
> the configuration to a project's conventions (coding style). Fixing those is
> relatively easy, so in the first phase the number of violations drops very
> quickly. It would make sense that the rating would increase a bit because of
> this, but not as dramatically as it does now. Conversely, the last issues to
> be fixed are probably the hardest, so fixing just a handful of those should
> already have a noticeable effect on the rating.
>
> I do think the rating has some value: if you compute it over a long time,
> you can get an impression whether the quality of your code base is improving
> or deteriorating over time. Currently the absolute score does not have much
> meaning though. It would be useful if someone could attempt to tune it.
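To put some numbers on the rating discussion above, here is a small sketch
comparing the default formula with one possible non-linear variant. The
linear_rating function just restates the quoted evaluation expression. The
nonlinear_rating function and its 10 / (1 + penalty) mapping are purely my
own assumption to illustrate the shape being described, not something pylint
implements or anything that has been tuned.

```python
def linear_rating(error, warning, refactor, convention, statement):
    """The default evaluation expression quoted above."""
    return 10.0 - ((float(5 * error + warning + refactor + convention)
                    / statement) * 10)


def nonlinear_rating(error, warning, refactor, convention, statement):
    """Hypothetical alternative: squash the same penalty into (0, 10].

    The curve is flat when there are many issues and steep near zero
    issues, so fixing the last few problems moves the score the most.
    """
    penalty = float(5 * error + warning + refactor + convention) / statement
    return 10.0 / (1.0 + penalty)


if __name__ == "__main__":
    # 100 statements, with the number of convention messages going down.
    for convention in (700, 600, 10, 0):
        print(convention,
              linear_rating(0, 0, 0, convention, 100),
              nonlinear_rating(0, 0, 0, convention, 100))
```

With this mapping the score never goes negative, and since it is strictly
decreasing in the penalty, every fixed issue still shows up as an
improvement, which clipping at 0 would not give you.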
>
> Bye,
> Maarten
> _______________________________________________
> Python-Projects mailing list
> Python-Projects@lists.logilab.org
> http://lists.logilab.org/mailman/listinfo/python-projects