Hi,

Since there is no coverage report on tests in vdsm, if a PEP 8 patch
passes the tests, we still not sure if there is no mistake in it.
Viewing the diff reports on all the changes consumes a lot of time, and
some small but fatal mistakes(like misspelling variable name) can easily
be ignored by human eyes.

So I have a try on the compiler module of Python. I write a tool named
'pydiff'. pydiff parses two Python scripts into Abstract Syntax Trees.
These data structures can reflect the logic of the code, and pydiff
performs a recursive compare on the trees. Then pydiff reports
differences and the corresponding line numbers. In this way, pydiff
ignores code style changes, and reports only logical changes of the code.

I think this tool can save us a lot of time. After a PEP 8 patch passes
vdsm tests and pydiff, I will get some confidence on the patch and it
probably does not break anything in vdsm.

Here is a usage example:

-------- test_o.py --------
def foo(a, b):
pass

if __name__ == '__main__':
A = [1, 2, 3]
print (4, 5, 6), \
"over"
foo(1, 2)
print 'Hello World'


-------- test_n.py --------
def foo(a, b):
pass

if __name__ == '__main__':
A = [1,
2, 3]
print (4, 5, 6), "over"
fooo(
1, 2)
print ('Hello '
'World')


Some differences of the files are just a matter of style. The only
significant difference is the function call "foo()" is misspelled in
"test_n.py".

Run pydiff.py, it will report:

$ python pydiff.py test_*.py
1 difference(s)
first file: test_n.py
second file: test_o.py

((8, 'fooo'), (8, 'foo'))

This report tells us that 'fooo' in line 8 of "test_n.py" is different
from 'foo' in line 8 of "test_o.py".


It can also find insertions or deletions. Here is another simple example:

---- old.py ----
print 'Hello 1'
print 'Hello 2'
print 'Hello 3'
print 'Hello 4'
print 'Hello 5'

---- new.py ----
print 'Hello 1'
print 'Hello 3'
print 'Hello 4'
print 'Hello 5'
print 'Hello 5'

Run pydiff:

$ pydiff old.py new.py
2 difference(s)
first file: old.py
second file: new.py

((2, Printnl([Const('Hello 2')], None)), (2, None))

((5, None), (5, Printnl([Const('Hello 5')], None)))

Here "((2, Printnl([Const('Hello 2')], None)), (2, None))" means there
is a print statement in line 2 of old.py, but no corresponding statement
in new.py, so we can know the statement is deleted in new.py.
"((5, None), (5, Printnl([Const('Hello 5')], None)))" means there is a
print statement in line 5 of new.py, but no corresponding statement in
old.py, so we can know the statement is inserted in new.py.


Sometimes the change in code logic is acceptable, for example, change
"aDict.has_key(Key)" into "Key in aDict". pydiff can report a difference
in this case, but it is up to the user to judge whether it's acceptable.
pydiff is just a tool to help you finding these changes.

I hope it can be helpful for PEP 8 patch reviewers. If you find any
bugs, please let me know. The script is in the attachment.

-- 
Thanks and best regards!

Zhou Zheng Sheng / 周征晟
E-mail: zhshz...@linux.vnet.ibm.com
Telephone: 86-10-82454397

Attachment: pydiff.tar.gz
Description: GNU Zip compressed data

_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel

Reply via email to