This is an automated email from the ASF dual-hosted git repository.
wmb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drat.wiki.git
The following commit(s) were added to refs/heads/master by this push:
new f7480a6 Created Mission Statement (markdown)
f7480a6 is described below
commit f7480a65bdea1df7e7308e4584b9d9cfa434fe00
Author: Wayne Moses Burke <[email protected]>
AuthorDate: Thu Feb 1 14:27:15 2018 -0500
Created Mission Statement (markdown)
---
Mission-Statement.md | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/Mission-Statement.md b/Mission-Statement.md
new file mode 100644
index 0000000..b349d56
--- /dev/null
+++ b/Mission-Statement.md
@@ -0,0 +1,19 @@
+**What does DRAT stand for?**
+
+Distributed Release Audit Tool. Standing on the shoulders of Apache Creadur's
Release Audit Tool (RAT), this project aims to scale license checks out to
large code bases.
+
+**What does DRAT want?**
+
+The Distributed Release Audit Tool (DRAT) improves on the Apache RAT code
audit tool in several ways.
+RAT is a command-line tool, Java API, and Maven plugin that audits a code
base against its declared OSS license: if you say it's Apache2, RAT checks
whether your source is Apache2 and produces a report stating which files are,
which aren't, and why. RAT has several problems, namely:
+
+* It doesn't scale to large code bases: a run on a code base of 25k files and
10M LOC took ~4 weeks on a normal Linux server with 5GB of memory, plenty of
hard disk, and modern CPUs.
+* RAT's crawler is rudimentary: you have to supply explicit white/black lists
of files to skip, or else it checks binary files for licenses.
+* RAT doesn't produce incremental output. It either completes and generates a
log, or it doesn't.
+
+DRAT improves upon RAT by addressing all of the above concerns.
+DRAT is a MapReduce version of RAT. It uses Apache Tika to automatically sort
and classify the code base files; Apache OODT to index metadata and Tika
information about those files into Apache Solr; and OODT again to produce a
MapReduce workflow that runs RAT incrementally on k-sized chunks of
same-MIME-typed files (as detected by Tika), produces incremental per-type
logs, and then aggregates and reduces them into a combined log at the end.
+
+**What's the status of the project?**
+
+As of September 2017, the project was granted top-level status after being
developed for a while on GitHub.