[ 
https://issues.apache.org/jira/browse/RAT-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marija Sljivovic updated RAT-45:
--------------------------------

    Attachment: apache-rat-pd-0.02.zip

The project structure has been changed to follow some of the recommendations at 
http://issues.apache.org/jira/browse/RAT-45, but there is still room for 
improvement.
Basic support for reading whole directories of source files has been added.
Maven support is still unfinished, and several of the recommended changes have 
yet to be written. The ISearchEngine interface is unchanged, but a new 
implementation of GoogleCodeSearchParser is under development. It will 
use the Google Code Search API.
More about it at: 
http://wiki.apache.org/general/MarijaSljivovic/SoC2009ApacheRatProposal

> Apache RAT copy&paste detector - tool for detecting copied(plagiarised) code 
> by searching on web code search engines
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: RAT-45
>                 URL: https://issues.apache.org/jira/browse/RAT-45
>             Project: RAT
>          Issue Type: New Feature
>         Environment: These improvements to the Apache RAT tool will be written in 
> Java.
> Requirements: an OS with a JRE already installed and an Internet connection
>            Reporter: Marija Sljivovic
>         Attachments: apache-rat-pd-0.02.zip, copyandpaste.zip, 
> copyandpastedetector-src-0.01.zip
>
>   Original Estimate: 2688h
>  Remaining Estimate: 2688h
>
> This document describes a new tool that will be included in the Apache 
> RAT project.
> Original idea: http://wiki.apache.org/general/SummerOfCode2009#rat-project
> The aim is to create a working, modular, configurable command-line tool
> that searches web-based code search engines for possibly plagiarised 
> code in our code bases.
> The tool will be heuristic in nature: it will make guesses about parts of the code.
> If it decides that a code part is a likely candidate for copy&paste, it will 
> check whether matching code exists on code search engines.
> That code part will be stored in the report if any match is found.
> The person who reads the report will decide whether the code was really copied.
> The algorithm at the base of this tool is a variant of the sliding-window 
> algorithm.
> The code parts that the algorithm generates will be checked by different 
> heuristic methods and, optionally,
> sent to a code search engine for checking.
> More information and ideas about this project can be found here:
> http://wiki.apache.org/general/MarijaSljivovic/SoC2009ApacheRatProposal
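
The sliding-window idea in the description above could look roughly like the
following. This is only a minimal sketch and is not taken from the attached
code: the class name, the method names, the "enough non-blank characters"
heuristic, and the Predicate standing in for a code search engine are all
assumptions made for illustration. It does not use ISearchEngine or any real
search API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class SlidingWindowSketch {

    /** Returns every run of windowSize consecutive lines, advancing by step lines. */
    static List<String> windows(List<String> lines, int windowSize, int step) {
        if (windowSize <= 0 || step <= 0) {
            throw new IllegalArgumentException("windowSize and step must be positive");
        }
        List<String> result = new ArrayList<>();
        for (int start = 0; start + windowSize <= lines.size(); start += step) {
            result.add(String.join("\n", lines.subList(start, start + windowSize)));
        }
        return result;
    }

    /** Hypothetical heuristic: a window is worth checking if it has enough non-blank characters. */
    static boolean looksWorthChecking(String window, int minNonBlankChars) {
        long count = window.chars().filter(c -> !Character.isWhitespace(c)).count();
        return count >= minNonBlankChars;
    }

    /**
     * Collects the windows that pass the heuristic and for which the
     * (stand-in) search engine reports a match.
     */
    static List<String> report(List<String> lines, int windowSize, int minChars,
                               Predicate<String> foundOnline) {
        List<String> suspicious = new ArrayList<>();
        for (String window : windows(lines, windowSize, 1)) {
            if (looksWorthChecking(window, minChars) && foundOnline.test(window)) {
                suspicious.add(window);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        List<String> source = List.of("int a = 1;", "int b = 2;", "return a + b;", "}");
        // Stand-in for a real code search engine: "finds" any window containing this line.
        Predicate<String> fakeEngine = w -> w.contains("return a + b;");
        for (String hit : report(source, 2, 10, fakeEngine)) {
            System.out.println("--- possible match ---");
            System.out.println(hit);
        }
    }
}
```

The heuristic runs before the search so that windows with too little content
never reach the search engine. A person still has to read the report and
decide whether a match is a real copy.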

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
