This is an automated email from the ASF dual-hosted git repository.

wmb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drat.wiki.git


The following commit(s) were added to refs/heads/master by this push:
     new f7480a6  Created Mission Statement (markdown)
f7480a6 is described below

commit f7480a65bdea1df7e7308e4584b9d9cfa434fe00
Author: Wayne Moses Burke <[email protected]>
AuthorDate: Thu Feb 1 14:27:15 2018 -0500

    Created Mission Statement (markdown)
---
 Mission-Statement.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Mission-Statement.md b/Mission-Statement.md
new file mode 100644
index 0000000..b349d56
--- /dev/null
+++ b/Mission-Statement.md
@@ -0,0 +1,19 @@
+**What does DRAT stand for?**
+
+Distributed Release Audit Tool - standing on the shoulders of Apache Creadur's Release Audit Tool (RAT), this project aims to scale license checks out to very large code bases.
+
+**What does DRAT want?**
+
+The Distributed Release Audit Tool (DRAT) improves on the Apache RAT code audit tool in several ways.
+RAT is a command-line tool, Java API, and Maven plugin that audits a code base against its declared OSS licenses - if you say your project is Apache2, RAT checks whether your source actually is, and produces a report stating which files are or aren't compliant, and why. RAT has several problems, namely:
+
+* It doesn't scale to large code bases - on a 25k-file, 10M-LOC code base, it ran for ~4 weeks on a typical Linux server with 5GB of memory, plenty of hard disk, and modern CPUs.
+* RAT's crawler is rudimentary: you must supply explicit white/black lists of files to skip, or it will check binary files for licenses.
+* RAT doesn't produce incremental output. It either completes and generates a log, or it doesn't.
+
+DRAT improves upon RAT by addressing all of the above concerns.
+DRAT is a MapReduce version of RAT. It uses Apache Tika to automatically sort and classify the code base's files; Apache OODT to index metadata and Tika information about those files into Apache Solr; and OODT to drive a MapReduce workflow that runs RAT incrementally on k-sized chunks of same-MIME-type files (as detected by Tika), producing incremental per-type logs that are then aggregated and reduced into a combined log at the end.
+
+**What's the status of the project?**
+
+As of September 2017, the project was granted top-level status after being developed for a while on GitHub.
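DRAT itself is Java built on OODT and Tika, but the chunked, per-MIME-type map/reduce flow described in the page above can be sketched conceptually in a few lines of plain Python. This is illustrative only: the function names (`partition_by_type`, `chunks`, `audit`) are invented for this sketch, the stdlib `mimetypes` module stands in for Tika's detection, and the actual RAT invocation is replaced by a placeholder log line.

```python
import mimetypes
from collections import defaultdict
from itertools import islice

def partition_by_type(paths):
    """Bucket files by guessed MIME type (DRAT uses Apache Tika for this step)."""
    buckets = defaultdict(list)
    for p in paths:
        mime, _ = mimetypes.guess_type(p)
        buckets[mime or "application/octet-stream"].append(p)
    return buckets

def chunks(seq, k):
    """Yield k-sized chunks of a file list."""
    it = iter(seq)
    while chunk := list(islice(it, k)):
        yield chunk

def audit(paths, k=2):
    """Map each same-MIME-type chunk to a per-type log entry, then
    reduce the entries into one combined log (placeholder for running RAT)."""
    logs = []
    for mime, files in partition_by_type(paths).items():
        for chunk in chunks(files, k):
            # placeholder for invoking RAT on this chunk of files
            logs.append(f"{mime}: checked {len(chunk)} file(s)")
    return "\n".join(logs)  # reduce step: combined log
```

Because each chunk is processed independently and emits its own log entry, partial results are available as the run progresses - the incremental-output property the page contrasts with RAT's all-or-nothing behavior.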

-- 
To stop receiving notification emails like this one, please contact
[email protected].
