Re: Output Semantics

Robert Burrell Donkin Mon, 16 Mar 2009 14:23:00 -0700

Jochen Wiedmann wrote:
> Hi,
> 
> as you have probably noticed, I have created a new branch for
> experimenting with RAT. The reason for creating a branch was that I
> found RAT's way of emitting output plainly confusing, at least to me.
> I never fully understood the system with "subject", "predicate", and
> "object". In particular, it was never clear to me, how "header
> sample", "license family", and so on relate. Apart from that, RAT-14
> strongly asked for a semantically richer output than basically a table
> with three columns.


RDF is surprisingly powerful (but the streaming design was a mistake),
and the power lies in the loose coupling between concepts. probably a
meta-data store design would have been better (and easier to understand).

> I have now (partially) resolved this in a way that satisfies me (but
> possibly others not as well): The output is now a series of "IClaim"
> objects with a class hierarchy that provides the semantical
> information. In particular (resolving RAT-14), running RAT will now
> result in the creation of a "ClaimStatistic". This result can be
> viewed on
>
>   https://svn.apache.org/repos/asf/incubator/rat/main/branches/rat-output-semantics/
>
> I would now like to ask for confirmation to treat this as the base for
> RAT 0.7. As I do now have a more thorough understanding, I should as
> well be able to roll back most of my changes and create the
> "ClaimStatistic" with comparatively minor changes. However, my feeling
> is that others would share my problems in the future.

one worry i had about non-streaming approaches is that they're not easy
to use with big data sets (for example, scanning all the source in the
incubator), since all the data needs to be collected before the report
can be produced.
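
To make the trade-off concrete, here is a minimal Java sketch of a statistic built in streaming fashion: claims are consumed one at a time and only a counter per category is kept, so memory stays bounded however many files are scanned. All names (StreamingStatisticSketch, LicenseKind, count) are hypothetical and are not taken from the RAT code base.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical names throughout; none of these are RAT's actual classes.
final class StreamingStatisticSketch {

    // Assumed categories that a scanned file could be claimed to belong to.
    enum LicenseKind { APPROVED, UNAPPROVED, UNKNOWN }

    // Consumes claims one at a time and keeps only a counter per kind,
    // so memory use does not grow with the number of files scanned.
    static Map<LicenseKind, Integer> count(Iterator<LicenseKind> claims) {
        Map<LicenseKind, Integer> totals = new EnumMap<>(LicenseKind.class);
        while (claims.hasNext()) {
            totals.merge(claims.next(), 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Iterator<LicenseKind> claims = Arrays.asList(
                LicenseKind.APPROVED,
                LicenseKind.UNKNOWN,
                LicenseKind.APPROVED).iterator();
        System.out.println(count(claims));
    }
}
```

A summary like this only needs the totals at the end, so it does not by itself force the whole data set to be held before reporting; a report that lists every file would be a different matter.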

but i haven't found much time for RAT so feel free to take the design in
whatever direction you want. experience with scan is that the code reuse
has turned out to be limited.

> If no one else intervenes, then I'd move the current trunk to
> "branches/apache-rat-project-0.6" and my private branch to the trunk.
> I'd also like to use the "ClaimStatistics" to create a set of
> so-called policies. Policies would be simple plugins for the RAT Maven
> Plugin, which allow to configure the required behaviour quite easily.
> Typical policies might be "only ASL files", "only approved licenses",
> "at most 3 unknown files", and so on. This allows projects to
> integrate RAT into their standard build, refusing the build, if the
> policy isn't met.
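
The policies described above could be sketched in Java as simple predicates over a summary of the run. This is only an illustration under stated assumptions: PolicySketch, Statistic, ONLY_APPROVED and atMostUnknown are hypothetical names, not the RAT or Maven plugin API, and treating unknown files as a failure of "only approved licenses" is an assumption of this sketch.

```java
import java.util.function.Predicate;

// Hypothetical names throughout; this is not the RAT or Maven plugin API.
final class PolicySketch {

    // Assumed summary of a run: how many files fell into each bucket.
    static final class Statistic {
        final int approved;
        final int unapproved;
        final int unknown;

        Statistic(int approved, int unapproved, int unknown) {
            this.approved = approved;
            this.unapproved = unapproved;
            this.unknown = unknown;
        }
    }

    // "only approved licenses": assumed to mean no unapproved and no unknown files.
    static final Predicate<Statistic> ONLY_APPROVED =
            s -> s.unapproved == 0 && s.unknown == 0;

    // "at most N unknown files": passes while the unknown count stays within the limit.
    static Predicate<Statistic> atMostUnknown(int limit) {
        return s -> s.unknown <= limit;
    }

    public static void main(String[] args) {
        Statistic stats = new Statistic(120, 0, 2);
        // A build integration would fail the build when a policy returns false.
        System.out.println(ONLY_APPROVED.test(stats));
        System.out.println(atMostUnknown(3).test(stats));
    }
}
```

Keeping each policy as a plain predicate means several of them can be combined with Predicate.and or Predicate.or without any extra machinery.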

i introduced the semantic stuff to handle policy

IMHO the right way to approach policies is through ontologies and RDF.
the problem is that there are only so many ways to handle the first-order
logic that's required to solve this in the general case.

- robert
