On 07/10/13 23:49, Manuel Suárez Sánchez wrote:

1. scan the source, building a strongly-typed, immutable domain model


This point is basic to improve the project because now there aren´t a good
domain model and it´s very confused.

I think that the question comes down to granularity.

Here's one way that the two contrasting approach might work...

With the full model approach, the source would be scanned completed into a model before the document contents were analysed. Once the analysis was complete, then the reporting would start. The process flow would be course-grained. This would cut across the grain of the current Rat design.

With a message oriented architecture, the scanner would send each document to enrichment as soon as it was created. The enricher would take a look at the contents and add document-level meta-data, then pass on the enriched object as soon as it was created. Aggregate analysers would then build up the report. This would be sympathetic to the current Rat design.

Retaining a streaming/messaging architecture means modelling at the message level (rather than more complete structures)

<snip>

However, I think that the current streaming design isn't particularly
intuitive or obvious. I would be happy to retain an improved streaming
design.


I think that apache rat is a release audit tool, focused on licenses. In
the project you analyse a file(audio) and you get the license of the file. Why
do you try to use streaming/message driven architecture?

Performance at small memory footprint

Robert

Reply via email to