I am fairly confident that I can write a suite of scripts that will fix all major problems discussed in pep8 in about a week.
I say this because imo a pep8 fixer can be much simpler than either pylint or 2to3. Indeed, there are three possible ways to write a code munger:

1. Use ast trees. This is the way followed by pylint and 2to3. The advantage of this approach is that the ast (parse) trees make discovering the detailed structure of a program very easy. The disadvantage is that character-level information is (very) difficult to obtain.

2. Use tokens. It's harder to obtain structural (parser) information, but easier to obtain character data. indent.py uses this way, iirc. So does Leo's pretty printer.

3. Use strings for (almost) everything. This is the way I shall adopt.

There is no doubt in my mind that the third way is the easiest. It may seem counter-intuitive: scripts must do extra work to discover the structure of the program being munged. The great advantage of this way is that the scripts always have the actual strings of the text to work on. Thus, replacing one string by another is trivial. And it is this string replacement that is the essential operation.

As it turns out, I have lots of experience with this strategy. It underlies all of Leo's importers. It's simple, it works, and I am comfortable with it. Also, Leo's importers already provide methods that will discover the range of text covered by a class or def, which is really all that needs to be done.

The pep8 fixer will consist of a series of simple, self-contained scripts. Each script will apply one particular fix to a string, possibly using global context.

Each script will have the same basic organization. It will be a character-by-character **scanner** that understands Python strings, comments and (for some scanners) classes and defs (functions or methods). And maybe also Leo's doc parts. Writing such scanners is second nature to me. Conceptually they are very simple.

There are two kinds of pep8 fixers: local fixers and global fixers. Local fixers require no context.
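To make the scanner idea concrete, here is a minimal sketch of a character-by-character scanner and a string-to-string local fixer built on it. This is only an illustration under my own assumptions, not Leo's importer code: the names scan, skip_string and fix_code_only are hypothetical, and the sketch ignores classes, defs and Leo's doc parts.

```python
def skip_string(s, i):
    """Return the index just past the string literal that starts at s[i]."""
    quote = s[i]
    # Triple-quoted strings use a three-character delimiter.
    delim = quote * 3 if s.startswith(quote * 3, i) else quote
    j = i + len(delim)
    while j < len(s):
        if s[j] == '\\':
            j += 2  # Skip the backslash and the character it escapes.
        elif s.startswith(delim, j):
            return j + len(delim)
        else:
            j += 1
    return len(s)  # Unterminated string: consume the rest of the text.


def scan(s):
    """Yield (kind, text) pairs that cover s exactly.

    kind is 'code', 'string' or 'comment'.
    """
    i, start, n = 0, 0, len(s)
    while i < n:
        ch = s[i]
        if ch == '#':
            j = s.find('\n', i)
            j = n if j == -1 else j  # The comment excludes its newline.
            kind = 'comment'
        elif ch in '"\'':
            j = skip_string(s, i)
            kind = 'string'
        else:
            i += 1
            continue
        if start < i:
            yield 'code', s[start:i]  # Flush the pending code segment.
        yield kind, s[i:j]
        i = start = j
    if start < n:
        yield 'code', s[start:]


def fix_code_only(s, fix):
    """Apply fix (a str -> str function) to code segments only."""
    return ''.join(
        fix(text) if kind == 'code' else text
        for kind, text in scan(s)
    )
```

For example, fix_code_only(source, lambda t: t.replace('\t', '    ')) would expand tabs in code while leaving tabs inside strings and comments untouched. Because the segments always join back to the original text, each fixer remains a plain string-to-string function.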
Local fixers will clean blank lines, replace tabs by spaces, and split long lines into shorter lines.

Global fixers work (conceptually) on a list of files. For example, the fixer that changes a class name from xxxYyy to XxxYyy should work on all the files of a project, such as all the files in Leo's core. That allows the fixer to pre-scan for conflicts before making any changes.

Global fixers will likely have two passes. The first pass will construct a global symbol table. The fixer will likely abort if the fix might map distinct input symbols into the same output symbol. Assuming there are no collisions, the second pass will substitute the approved spelling for the dubious spelling.

I shall definitely write many unit tests first as a way of addressing various design issues. I know from experience (and from my initial ruminations) that designing the unit tests will uncover design questions in the easiest way possible.

Packaging will be interesting. Obviously, I want to run the fixer as a Leo script, but I shall also want to package it for use by those who do not use Leo. This suggests that each fixer will convert a string to a string. Wrapper functions will allow the primary fixers to work in various contexts.

That's about it. I have studied 2to3 and pylint in enough detail to be quite confident that my approach is fundamentally simpler than using asts or streams of tokens.

It would also be reasonable to convert 2to3 into a pep8 fixer. After all, both rewrite code. But the mechanics behind 2to3 are horrendously complex. I want to base my code on something dirt simple: a generic Python scanner. It's the way I think. More importantly, many fixers are fundamentally involved with characters. Trying to "abstract" characters away actually makes things harder.

Edward

P.S. Speed is completely irrelevant here. Or rather, the only thing that matters is how fast I can write and debug these scripts :-)

P.P.S.
For reasons that are not completely clear to me, I have always loved writing this kind of code. I'm pumped about this project. I'm going to make a bit of a race out of this. The goal: to complete this project using TDD in less than a week.

EKR
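As a rough illustration of the two-pass global fixer described earlier in this message, here is a minimal sketch. Everything in it is an assumption made for the example: the names approved, build_renames and apply_renames are hypothetical, files is assumed to be a dict mapping file names to source strings, and classes are found with a simple regex rather than a real scanner, so names inside strings and comments would be rewritten too.

```python
import re

# Simplifying assumption: a class is any line that starts with "class Name".
CLASS_RE = re.compile(r'^[ \t]*class[ \t]+([A-Za-z_]\w*)', re.MULTILINE)


def approved(name):
    """Hypothetical spelling rule: xxxYyy -> XxxYyy."""
    return name[:1].upper() + name[1:]


def build_renames(files):
    """Pass 1: build the symbol table and abort on any collision."""
    names = set()
    for source in files.values():
        names.update(CLASS_RE.findall(source))
    renames = {}
    for old in sorted(names):
        new = approved(old)
        if new == old:
            continue  # Already spelled the approved way.
        if new in names or new in renames.values():
            # Two distinct input symbols would map to one output symbol.
            raise ValueError(f'collision: {old!r} -> {new!r}')
        renames[old] = new
    return renames


def apply_renames(files, renames):
    """Pass 2: substitute the approved spelling in every file."""
    if not renames:
        return dict(files)
    pattern = re.compile(
        r'\b(' + '|'.join(map(re.escape, renames)) + r')\b')
    return {
        name: pattern.sub(lambda m: renames[m.group(1)], source)
        for name, source in files.items()
    }
```

Nothing is changed unless the first pass succeeds for the whole list of files, which is the point of the pre-scan. Both passes work on plain strings, so the same functions could be wrapped for use inside or outside Leo.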
