There are lots of tools written to identify similar code, and they do
much more than what you're suggesting.

For example, they build a graph of which instructions must necessarily come
before others. Take:

a = 2;
b = 3;

It doesn't matter which instruction comes first, so code like
b = 3;
a = 2;

would match.

But,
a = 2;
b = a + 3;

can't be reordered, because the second statement reads a.
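
To make the idea concrete, here is a rough Python sketch of that ordering check. It is only an illustration under strong assumptions: every statement is a plain assignment of the form "name = expr;", each statement text appears only once, and the function names (parse, dependency_edges, same_up_to_reordering) are made up for this example, not taken from any real tool.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def parse(stmt):
    """Split 'x = expr;' into (variable written, set of variables read)."""
    target, expr = stmt.rstrip(";").split("=", 1)
    return target.strip(), set(IDENT.findall(expr))

def dependency_edges(statements):
    """Return pairs (i, j) with i < j where statement j must stay after i."""
    parsed = [parse(s) for s in statements]
    edges = set()
    for j, (w_j, r_j) in enumerate(parsed):
        for i in range(j):
            w_i, r_i = parsed[i]
            # read-after-write, write-after-read, or write-after-write
            if w_i in r_j or w_j in r_i or w_i == w_j:
                edges.add((i, j))
    return edges

def same_up_to_reordering(a, b):
    """True if b is a permutation of a that keeps every dependency of a.

    Assumes each statement text occurs at most once in a and b.
    """
    if sorted(a) != sorted(b):
        return False
    position = {stmt: k for k, stmt in enumerate(b)}
    return all(position[a[i]] < position[a[j]]
               for i, j in dependency_edges(a))
```

With this, the two independent assignments match in either order, while swapping "a = 2;" and "b = a + 3;" is rejected because the second statement depends on the first.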

The algorithm to implement 3 is very easy: you just normalize all the code
before comparing. Map the first variable you find to 'var0', the second to
'var1', and so on; every time you see a variable again, reuse the same name.

Compilers have to do this kind of work all the time.
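
A minimal sketch of that renaming in Python, assuming a C-like source and a crude regex tokenizer. The KEYWORDS set is just an illustrative subset, and the sketch does not handle comments or string literals; it also renames function names like any other identifier. The name normalize is my own.

```python
import re

# Identifiers, integer literals, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

# Illustrative subset only; a real tool needs the full keyword list
# of each supported language.
KEYWORDS = {"int", "for", "if", "else", "while", "return"}

def normalize(source):
    """Rename identifiers to var0, var1, ... in order of first appearance."""
    names = {}
    out = []
    for tok in TOKEN.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            # First sighting gets the next free name; later ones reuse it.
            tok = names.setdefault(tok, "var%d" % len(names))
        out.append(tok)
    return " ".join(out)
```

Two sources that differ only in variable names then normalize to the same string, so a plain string comparison is enough to catch them.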

The algorithm to implement 4 is very easy too: split your source code into
functions, store each of them along with the ID of the source file it came
from, and use any kind of search technique to match the strings.
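
As a rough sketch of the storage and matching side in Python: this assumes the functions have already been split out of each source file (that is the hard, language-specific part), and it uses a hash of the whitespace-collapsed text as the "search technique". The names fingerprint and shared_functions are hypothetical.

```python
import hashlib
from collections import defaultdict

def fingerprint(body):
    """Hash a function's text after collapsing runs of whitespace."""
    canonical = " ".join(body.split())
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

def shared_functions(sources):
    """Find functions that appear in more than one source.

    sources: {source_id: [function text, ...]}
    Returns {fingerprint: sorted list of source ids} for every
    fingerprint seen in at least two different sources.
    """
    index = defaultdict(set)
    for source_id, functions in sources.items():
        for body in functions:
            index[fingerprint(body)].add(source_id)
    return {fp: sorted(ids) for fp, ids in index.items() if len(ids) > 1}
```

Exact hashing only catches functions that are identical up to whitespace; renaming the variables first would let the same index catch copies with changed names as well.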

So the difficult part is parsing the source code according to each
language's syntax. For GCJ that's even harder, because the programming
languages are not restricted.

On Tue, May 18, 2010 at 2:58 PM, Paul Smith <[email protected]> wrote:

> Someone did this last year I think, I forget who though.
>
> On Tuesday, May 18, 2010, Pakku <[email protected]> wrote:
> > Hello friends,
> >
> > How about writing an application to identify the cheaters in gcj
> >
> > 1. Comparing the n different source files where n is the number of
> > participants in gcj to find duplicate source file. [ should be
> > possible ]
> > 2. Comparing the source files by ignoring the comments. [ should be
> > possible ]
> > 3. Comparing the source files by ignoring the variable names. I mean 2
> > source files are same except that the variables used are different.
> > [should be possible, but difficult to implement ]
> > 4. Comparing source files to identify if just 1 or more functions are
> > same. [should be possible, but difficult to implement ]
> >
> > I would like to find out how can we actually program for 3 & 4.
> >
> > Wonder if google already has this application handy with them.
> >
> > regards,
> >  prakash
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> "google-codejam" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> [email protected]<google-code%[email protected]>
> .
> > For more options, visit this group at
> http://groups.google.com/group/google-code?hl=en.
> >
> >
>
> --
> Paul Smith
>
> [email protected]
>

