Re: [GSOC] Rat: Past, Present and Future

Robert Burrell Donkin Thu, 11 Jul 2013 12:50:49 -0700

On 07/10/13 23:49, Manuel Suárez Sánchez wrote:


1. scan the source, building a strongly-typed, immutable domain model



This point is fundamental to improving the project, because at present there
isn't a good domain model and it's very confusing.


I think that the question comes down to granularity.

Here's one way that the two contrasting approaches might work...

With the full model approach, the source would be scanned completely into a model before the document contents were analysed. Once the analysis was complete, the reporting would start. The process flow would be coarse-grained. This would cut across the grain of the current Rat design.
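
As a rough illustration of that coarse-grained flow, here is a minimal Java sketch. Every name in it (FullModelSketch, Document, SourceModel, scan, analyse, report) is hypothetical and not taken from the Rat code base, and the licence check is a toy substring match standing in for real analysis. The point is only that each phase finishes before the next begins, so the whole model sits in memory at once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical names throughout; none of these types come from the Rat code base.
final class FullModelSketch {

    /** Immutable document: a name plus its contents. */
    static final class Document {
        final String name;
        final String contents;

        Document(String name, String contents) {
            this.name = name;
            this.contents = contents;
        }
    }

    /** Immutable model that holds every scanned document at once. */
    static final class SourceModel {
        final List<Document> documents;

        SourceModel(List<Document> documents) {
            // Defensive copy so later changes to the caller's list cannot leak in.
            this.documents = Collections.unmodifiableList(new ArrayList<>(documents));
        }
    }

    /** Phase 1: scan the whole source into the model. */
    static SourceModel scan(List<Document> source) {
        return new SourceModel(source);
    }

    /** Phase 2: analyse the complete model (toy check: is the header text present?). */
    static List<String> analyse(SourceModel model) {
        List<String> unlicensed = new ArrayList<>();
        for (Document d : model.documents) {
            if (!d.contents.contains("Licensed to the Apache Software Foundation")) {
                unlicensed.add(d.name);
            }
        }
        return unlicensed;
    }

    /** Phase 3: report, only after the analysis is complete. */
    static String report(SourceModel model, List<String> unlicensed) {
        return model.documents.size() + " scanned, "
                + unlicensed.size() + " unlicensed: " + unlicensed;
    }

    public static void main(String[] args) {
        List<Document> source = new ArrayList<>();
        source.add(new Document("A.java", "// Licensed to the Apache Software Foundation"));
        source.add(new Document("B.java", "class B {}"));

        SourceModel model = scan(source);
        System.out.println(report(model, analyse(model)));
    }
}
```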

With a message oriented architecture, the scanner would send each document to enrichment as soon as it was created. The enricher would take a look at the contents and add document-level meta-data, then pass on the enriched object as soon as it was created. Aggregate analysers would then build up the report. This would be sympathetic to the current Rat design.
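
A minimal Java sketch of that message flow, under stated assumptions: every name here (StreamingSketch, Document, EnrichedDocument, Enricher, Aggregator, scan) is hypothetical and not taken from the Rat code base, and the licence check is a toy substring match. Each document is handed downstream as soon as it exists, and the aggregator keeps only running counts, so nothing needs to hold the full set of documents.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical names throughout; none of these types come from the Rat code base.
final class StreamingSketch {

    /** Message produced by the scanner. */
    static final class Document {
        final String name;
        final String contents;

        Document(String name, String contents) {
            this.name = name;
            this.contents = contents;
        }
    }

    /** Message produced by the enricher: the document plus document-level meta-data. */
    static final class EnrichedDocument {
        final Document document;
        final boolean licensed;

        EnrichedDocument(Document document, boolean licensed) {
            this.document = document;
            this.licensed = licensed;
        }
    }

    /** Looks at one document, adds meta-data, and passes it on immediately. */
    static final class Enricher implements Consumer<Document> {
        private final Consumer<EnrichedDocument> next;

        Enricher(Consumer<EnrichedDocument> next) {
            this.next = next;
        }

        @Override
        public void accept(Document d) {
            boolean licensed = d.contents.contains("Licensed to the Apache Software Foundation");
            next.accept(new EnrichedDocument(d, licensed));
        }
    }

    /** Aggregate analyser: builds up the report from running counts only. */
    static final class Aggregator implements Consumer<EnrichedDocument> {
        private int total;
        private int unlicensed;

        @Override
        public void accept(EnrichedDocument e) {
            total++;
            if (!e.licensed) {
                unlicensed++;
            }
        }

        String report() {
            return total + " scanned, " + unlicensed + " unlicensed";
        }
    }

    /** Scanner: pushes each document downstream as soon as it is available. */
    static void scan(Iterator<Document> source, Consumer<Document> sink) {
        while (source.hasNext()) {
            sink.accept(source.next());
        }
    }

    public static void main(String[] args) {
        Aggregator aggregator = new Aggregator();
        Iterator<Document> source = Arrays.asList(
                new Document("A.java", "// Licensed to the Apache Software Foundation"),
                new Document("B.java", "class B {}")).iterator();

        scan(source, new Enricher(aggregator));
        System.out.println(aggregator.report());
    }
}
```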

Retaining a streaming/messaging architecture means modelling at the message level (rather than more complete structures).


<snip>

However, I think that the current streaming design isn't particularly
intuitive or obvious. I would be happy to retain an improved streaming
design.



I think that Apache Rat is a release audit tool, focused on licenses. In
the project you analyse a file (audio) and you get the license of the file. Why
do you try to use a streaming/message-driven architecture?


Performance with a small memory footprint.

Robert
