Re: [racket] in praise of if's mandatory else clause

Neil Van Dyke Mon, 30 May 2011 17:05:10 -0700

Hendrik Boom wrote at 05/30/2011 06:58 PM:

On Mon, May 30, 2011 at 04:58:00PM -0400, Neil Van Dyke wrote:
* Do very expensive farming of system to detect places where programmers did copy&paste reuse, when for maintainability (and perhaps code footprint) we'd prefer that the code be generalized. I'm pretty sure that there is a programming practice that involves the train of thought "this problem A is similar to problem B that I have seen before, so I will copy the code for A and modify it to do B", and some programmers do this a lot more than others do. The funniest I've seen was a construction, "(if BOOLEAN-VARIABLE HUGE-BLOCK-OF-CODE-1 HUGE-BLOCK-OF-CODE-2)", where ediff eventually showed that the two huge blocks of code differed only in a single Boolean constant, equal to "BOOLEAN-VARIABLE". More commonly, this takes the form of a copy&pasted procedure within the same module, multiple definitions from one module pasted into another (which may not be modified), or an entire module cloned as a starting point. A checking tool for this would also be useful for identifying generalization opportunities throughout code that wasn't copy&paste'd, such as two procedures that coincidentally turned out almost the same, or a code pattern that is used widely and could be a macro. I think there's a PhD in there, unless it's already been mostly done.
Have a look at Dick Grune's (www.dickgrune.com) similarity tester (http://www.dickgrune.com/Programs/similarity_tester/).

Thanks. If I read correctly, I think this paper describes a heuristic similarity metric, crafted to detect copying of small introductory student programming assignments.

I imagine that a rough similarity metric like this might be used to speed up more expensive precise partial structural matching of chunks of code in large systems, to first find promising-looking general areas to target for the more expensive matching. I think that the expensive structural matching is necessary, so that you could generate complete suggested code improvements programmatically, and also to weed out some false-positives found by your heuristic.
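
To make the two-stage idea concrete, here is a minimal sketch, written in Python rather than Racket purely for illustration. Everything in it is an assumption of mine and not taken from Grune's tool: the cheap filter is a Jaccard score over token trigrams, the expensive step compares syntax trees with constants blanked out, and the names token_shingles, jaccard, constant_diffs, and find_clones, along with the 0.5 threshold, are hypothetical.

```python
import ast
import io
import tokenize
from itertools import combinations

# Token types that carry layout only and are ignored by the cheap metric.
_LAYOUT = (tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
           tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER)


def token_shingles(source, k=3):
    """Cheap fingerprint: the set of k-grams over the token strings."""
    readline = io.StringIO(source).readline
    toks = [t.string for t in tokenize.generate_tokens(readline)
            if t.type not in _LAYOUT]
    return {tuple(toks[i:i + k]) for i in range(len(toks) - k + 1)}


def jaccard(a, b):
    """Rough similarity in [0, 1] between two fingerprint sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


class _Blank(ast.NodeTransformer):
    """Replace every constant with a placeholder so only tree shape remains."""

    def visit_Constant(self, node):
        return ast.Constant(value=None)


def constant_diffs(src_a, src_b):
    """Expensive check: None if the trees differ in shape, otherwise the
    list of (constant_a, constant_b) pairs at which the two chunks differ."""
    def shape(src):
        return ast.dump(_Blank().visit(ast.parse(src)))

    def constants(src):
        return [n.value for n in ast.walk(ast.parse(src))
                if isinstance(n, ast.Constant)]

    if shape(src_a) != shape(src_b):
        return None
    return [(x, y) for x, y in zip(constants(src_a), constants(src_b))
            if (type(x), x) != (type(y), y)]


def find_clones(snippets, threshold=0.5):
    """Run the cheap metric on every pair, and the structural match only
    on pairs that score at or above the (arbitrary) threshold."""
    prints = {name: token_shingles(src) for name, src in snippets.items()}
    found = []
    for a, b in combinations(snippets, 2):
        if jaccard(prints[a], prints[b]) < threshold:
            continue  # not promising enough for the expensive step
        diffs = constant_diffs(snippets[a], snippets[b])
        if diffs is not None:
            found.append((a, b, diffs))
    return found
```

A pair that survives both stages with a single differing constant is a candidate for the "(if BOOLEAN-VARIABLE ...)" style of generalization: the returned pair of constants is what a suggested rewrite would turn into a parameter. A real tool would need partial matching of subtrees, which this sketch does not attempt.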

One exercise that I would find interesting is to look at examples of ``duplicate'' code in corpora of real-world software systems, and try to characterize those examples in a way useful for crafting this fast metric.


--
http://www.neilvandyke.org/

