Thank you for the detailed and informative reply.

I had a closer look at the Review Board diff algorithm and was rather
overwhelmed. Re-using the code for my simple purpose would be overkill.
But there still isn't any tool for working with the unified diff format in Python.
I found some respite with Google's "diff-match-patch" library, which I'm
currently playing around with. The library cannot handle the full-blown
unified diff format (with the Index, ---, +++ lines, etc.) but seems to
handle the diff chunks along with the range information quite well.
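For reference, the range information on each chunk lives in its "@@" header line. Below is a minimal sketch of reading it using only the standard library, not diff-match-patch; the name parse_hunk_header is hypothetical, and the sketch assumes headers of the form "@@ -start,count +start,count @@", where an omitted count means 1.

```python
import re

# Matches a unified diff hunk header such as "@@ -12,5 +14,7 @@ optional context".
# Both counts are optional; an omitted count is treated as 1.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count), or None if not a header."""
    match = HUNK_HEADER.match(line)
    if match is None:
        return None
    old_start, old_count, new_start, new_count = match.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )
```

Returning None for non-header lines lets the same function double as a test for where each chunk begins.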

I split the entire diff into chunks and passed them to the patch
library, which converts them into a saner format. I then take that
output and convert it to a two-column HTML table, where each line appears
* on both sides if the line is unchanged
* on the left side if the line has been removed
* on the right side if the line has been added.
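The layout above can be sketched roughly as follows. This is a standard-library illustration rather than the diff-match-patch code I am actually using; the names diff_to_rows and rows_to_html are hypothetical, and it assumes each chunk starts with an "@@" line and that the line after a chunk (such as "Index:") does not start with "+", "-" or a space.

```python
import html


def diff_to_rows(diff_text):
    """Turn unified diff text into (left, right) pairs, one per table row."""
    rows = []
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True  # range header: the chunk body starts on the next line
            continue
        if not in_hunk:
            continue  # skip Index:, ---, +++ and other file-level headers
        if line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        marker, text = line[:1], line[1:]
        if marker == "-":
            rows.append((text, None))  # removed: left side only
        elif marker == "+":
            rows.append((None, text))  # added: right side only
        elif marker == " " or line == "":
            rows.append((text, text))  # unchanged: both sides
        else:
            in_hunk = False  # anything else ends the chunk
    return rows


def rows_to_html(rows):
    """Render the row pairs as a two-column HTML table, escaping the text."""
    out = []
    for left, right in rows:
        left_html = html.escape(left) if left is not None else ""
        right_html = html.escape(right) if right is not None else ""
        out.append(f"<tr><td>{left_html}</td><td>{right_html}</td></tr>")
    return "<table>\n" + "\n".join(out) + "\n</table>"
```

A stricter version would count lines against the ranges in each "@@" header instead of relying on the first character alone, which matters when a chunk is followed directly by another file's "---" line.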

I was hoping you could tell me whether I'm going in the right direction.

So far, I've been testing it on Review Board's test diffs in the
testdata/diffs/unified directory, and the results have been acceptable. I'm
just concerned that the format might vary as I haven't seen many
test cases yet.


Senior Undergraduate student, Indian Institute of Technology, Kharagpur
