On Jun 28, 2022, at 8:51 AM, David Erlandson <david.erland...@rice.edu> wrote:

> I have a colleague who is looking to track changes in text of a manuscript 
> that has 4 revisions. Apparently there are pretty major changes to the 
> content and it would be great to identify them.
> 
> I was thinking through tools I'm familiar with (generally line by line 
> comparisons) but that would seem to have the pitfall of an early large 
> revision throwing off the comparison for the rest of the text. Another silly 
> thought was to start up a local wiki instance and overlay each version; use 
> the built in compare tools... Has anyone worked on a project like this?  Or 
> are there any tools built and ready to go? Any guidance would be appreciated.


If I understand the question correctly, then I believe you need to do what is 
usually called "collation" (the links below say "collocations"), and I used a 
JavaScript library to accomplish a similar task. The library is called 
TRAViz [1].
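
As a rough illustration of why comparing word by word, rather than line by 
line, copes better with an early large revision, here is a minimal sketch 
using difflib from the Python standard library. It is not what TRAViz does 
internally; the function name and the sample sentences are invented for the 
example.

```python
import difflib


def word_changes(old_text, new_text):
    """Return (tag, old_words, new_words) for each run of words that changed."""
    old_words = old_text.split()
    new_words = new_text.split()
    # autojunk=False keeps frequent words from being ignored in long texts
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


# Invented sample text, only to show the shape of the output
old = "the quick brown fox jumps over the lazy dog"
new = "the quick red fox leaps over the lazy dog today"
for change in word_changes(old, new):
    print(change)
```

Because the matcher re-synchronizes on the next run of shared words, a change 
near the start does not throw off the comparison for the rest of the text.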

More specifically, I had two sets of files, and each set was a translation of 
the Psalms: one from 1610 and the other from 1700. [2] I wanted to see where 
the two translations were similar and where they differed. The files in the 
two sets were similarly named. I then wrote a Python script that loops 
through the translations and outputs an HTML file. [3] The HTML file is highly 
structured, calls TRAViz, and outputs a visualization illustrating where the 
two translations differed and converged. You can temporarily see the results 
of these labors online, but be forewarned: TRAViz is doing a lot of work 
against many paragraphs, so rendering is slow. [4] 
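
The script at [3] is not reproduced here. As a rough sketch of the same 
general idea, assuming two directories whose files share names, the following 
pairs the files and writes one HTML comparison page per pair. It uses 
difflib.HtmlDiff from the Python standard library as a stand-in for TRAViz, 
and the function names, labels, and directory layout are hypothetical.

```python
import difflib
from pathlib import Path


def paired_files(dir_a, dir_b):
    """Yield (path_a, path_b) for every file name present in both directories."""
    names_a = {p.name for p in Path(dir_a).iterdir() if p.is_file()}
    names_b = {p.name for p in Path(dir_b).iterdir() if p.is_file()}
    for name in sorted(names_a & names_b):
        yield Path(dir_a) / name, Path(dir_b) / name


def write_comparisons(dir_a, dir_b, out_dir, label_a="A", label_b="B"):
    """Write one side-by-side HTML diff per pair; return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path_a, path_b in paired_files(dir_a, dir_b):
        lines_a = path_a.read_text(encoding="utf-8").splitlines()
        lines_b = path_b.read_text(encoding="utf-8").splitlines()
        # HtmlDiff builds a complete HTML page with changes highlighted
        page = difflib.HtmlDiff(wrapcolumn=60).make_file(
            lines_a, lines_b, fromdesc=label_a, todesc=label_b
        )
        target = out / (path_a.stem + ".html")
        target.write_text(page, encoding="utf-8")
        written.append(target)
    return written
```

A call such as write_comparisons("1610", "1700", "html", "1610", "1700") 
would then produce one page per shared file name; those directory names are 
placeholders, not the layout used at [2].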

HTH

[1] TRAViz - http://www.traviz.vizcovery.org
[2] Psalms - http://dh.crc.nd.edu/tmp/collocations/psalms/
[3] Python script - http://dh.crc.nd.edu/tmp/collocations/bin/psalms2html.py
[4] results - http://dh.crc.nd.edu/tmp/collocations/html/

--
Eric Lease Morgan
Navari Family Center for Digital Scholarship
Hesburgh Libraries
University of Notre Dame

574/631-8604
https://cds.library.nd.edu
