Thanks for the response, Phil. I guess I was thinking of a debug report such
as:
Files Analyzed: 19,543
Folders Analyzed: 343
Total lines of code analyzed: 1,544,346
Total lines of code in source: 1,244,346
Total lines of code in destination: 1,944,346
Total lines with exact matches: 856,644
Unique lines in source: 400,546
Unique lines in destination: 850,546
Similarity of source to destination: 45%
Exact matches of greater than 25 contiguous lines of code: 943
Exact matches of greater than 5 contiguous lines of code: 46,733

I looked into the plagiarism-detector tools and haven't found anything yet
that handles PHP. The command-line diff tools "should" be able to output
this type of report. I just figured that all of this info, with the
exception of the last two items, would already be tracked in the software
and would just need to be output somewhere.
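
For illustration only, most of the counts in the report above can be
approximated outside of Meld with a short script. The sketch below is not
something Meld provides; the function names (collect_lines, compare_trees),
the ".php" filter, and the whitespace normalization are all assumptions made
for the example. It treats each tree as a bag of lines, so it ignores line
order and cannot produce the two "contiguous lines" figures.

```python
from collections import Counter
from pathlib import Path


def collect_lines(root, suffix=".php"):
    """Return (file_count, Counter of normalized non-blank lines) under root."""
    counts = Counter()
    files = 0
    for path in sorted(Path(root).rglob("*" + suffix)):
        if not path.is_file():
            continue
        files += 1
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            # Assumption: collapse runs of whitespace so that re-indented
            # code still counts as a match; blank lines are skipped.
            normalized = " ".join(line.split())
            if normalized:
                counts[normalized] += 1
    return files, counts


def compare_trees(src_root, dst_root, suffix=".php"):
    """Summarize how many lines two source trees have in common."""
    src_files, src = collect_lines(src_root, suffix)
    dst_files, dst = collect_lines(dst_root, suffix)
    # Multiset intersection: a line repeated 3 times in one tree and
    # 2 times in the other contributes 2 matches.
    matched = sum((src & dst).values())
    src_total = sum(src.values())
    dst_total = sum(dst.values())
    return {
        "files_analyzed": src_files + dst_files,
        "lines_in_source": src_total,
        "lines_in_destination": dst_total,
        "exact_matches": matched,
        "unique_in_source": src_total - matched,
        "unique_in_destination": dst_total - matched,
        # Share of the source's lines that also appear in the destination.
        "similarity_percent": round(100 * matched / src_total) if src_total else 0,
    }
```

Calling compare_trees() with the two project directories returns a
dictionary whose keys roughly mirror the first rows of the report. Very
common lines such as "}" or "<?php" will inflate the match count, so a real
comparison would probably want to filter those out first.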

Alan

On Wed, Sep 27, 2017 at 4:14 PM, Phil Hord <[email protected]> wrote:

> Alan,
>
> Tools already exist that more directly meet your need.  Any unix-like
> system will have command-line tools to do most of this analysis.  I'd start
> with "diff -b -B -w", but you can also use "comm".  The comm tool relies on
> the files being sorted, though, so you might want to ignore "empty" lines
> or common lines like </head>, for example.
>
> There are some plagiarism-detector tools that may also help, but I don't
> have any experience with those.
>
> Feel free to contact me off-list if you need more specific guidance.
> Phil
>
>
> On Wed, Sep 27, 2017 at 2:49 PM Alan Halls <[email protected]> wrote:
>
>> I am involved in a legal matter regarding an employee's theft of trade
>> secrets. In particular he stole the source code for a website that he and 2
>> other programmers worked on for 2 years.
>>
>> I now have a copy of his project, and of course a copy of mine. I found
>> the software Meld which seems to do a great job on a one by one basis, but
>> it would be very time consuming to try to end up with any "score" of how
>> much of our original code is still in his existing project.
>>
>> He was sloppy, and his launched public website still has our company info
>> in the 404 page, which links you to the about us, pricing, docs, and
>> contact us pages, all of which still have the original code in them. So
>> there is no question about whether or not he took it, just how much
>> "custom" work he did for himself.
>>
>> I was kind of imagining a report with a total score, then the top 50
>> matches with each of their scores. Has anyone thought of adding that in? It
>> seems that all that info would be available already in the program, just
>> needing a view for it to display on.
>>
>> _______________________________________________
>> meld-list mailing list
>> [email protected]
>> https://mail.gnome.org/mailman/listinfo/meld-list
>
>
