I'm tearing into things, figuring out what I think the API direction is/should be... I'm having trouble writing a coherent message, so I'll just send this and see how you feel. In general, I think we need to make a bigger separation between what is 'core' and what are the building blocks for specific use cases.


CORE
================================
Fundamentally, Droids is a framework to keep a bunch of Workers processing Tasks.

"Core" components relate to keeping Workers on Task.

From the existing API, I think the following are "core"
  Queue
  Task
  Worker
  perhaps DelayTimer/Worker

Core should deal with all the threading issues related to managing the Tasks. All the ThreadPoolExecutor stuff.
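
To make that concrete, here is a rough sketch of what I mean by core owning the threading. Every name in it (SketchTask, SketchWorker, CoreSketch, drain) is made up for illustration and is not the existing Droids API; the only real API used is java.util.concurrent. It is one possible shape under those assumptions, not a proposal for the final signatures.

```java
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for Task and Worker; not the existing Droids interfaces.
interface SketchTask {
    String id();
}

interface SketchWorker<T extends SketchTask> {
    void execute(T task) throws Exception;
}

final class CoreSketch {
    /**
     * Runs every task on a fixed-size pool and returns how many
     * completed without throwing. All thread handling stays in here.
     */
    static <T extends SketchTask> int drain(List<T> tasks, SketchWorker<T> worker, int threads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        AtomicInteger completed = new AtomicInteger();
        for (T task : tasks) {
            pool.execute(() -> {
                try {
                    worker.execute(task);
                    completed.incrementAndGet();
                } catch (Exception e) {
                    // A real core would report or retry; this sketch just skips the task.
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let the submitted ones finish
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return completed.get();
    }
}
```

The point is only that a Worker implementation never sees the executor; it gets a Task and nothing else.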

Unless I'm missing something, I don't even see why Droid is an interface -- it appears to be the parent container for management logic. AbstractDroid introduces some shared logic. Is it just that it makes the manager Runnable?

The javadoc for Droid run() says: "Invoke an instance of the worker used in the droid" but the behavior in HelloCrawler is that run() initializes everything and starts the workers. Is there a reason this needs to happen in its own Runnable instance?

It seems the 'core' would focus on things like ThreadPoolExecutor.

I don't see any need for the existing Core.java class. Is it just there to make Spring configuration easier? That seems like poor design, since it gives access to everything. In my view, each component should only have access to what it needs.

Is the existing Core.java just part of the Cli helper app?  In
  public void start(String name){
    Droid droid = getDroid(name);
    droid.run();
  }



COMPONENTS / Blocks?  other name?
=================================

Each Droid implementation would include the 'Core' plus a set of components wired together. From the existing API, the things that strike me as components are:

Protocol
  URL >> InputStream
Parsing
  InputStream >> Metadata
Handler?  Action?
  Metadata >> something
  (save to solr)
  (write to disk)
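
As a rough illustration of how those three pieces could be wired together, here is a sketch. All the names (ProtocolSketch, ParserSketch, HandlerSketch, PipelineSketch) are hypothetical, and I'm using a plain Map as a placeholder for Metadata; this is not the existing API, just one way the URL >> InputStream >> Metadata >> something chain might look.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Hypothetical component interfaces; a Map stands in for Metadata.
interface ProtocolSketch {
    InputStream open(String url) throws IOException;          // URL >> InputStream
}

interface ParserSketch {
    Map<String, String> parse(InputStream in) throws IOException; // InputStream >> Metadata
}

interface HandlerSketch {
    void handle(Map<String, String> metadata) throws IOException; // Metadata >> something
}

final class PipelineSketch {
    private final ProtocolSketch protocol;
    private final ParserSketch parser;
    private final HandlerSketch handler;

    // Each component is handed in explicitly; none of them can reach the others.
    PipelineSketch(ProtocolSketch protocol, ParserSketch parser, HandlerSketch handler) {
        this.protocol = protocol;
        this.parser = parser;
        this.handler = handler;
    }

    /** Fetches one URL, parses it, and passes the result to the handler. */
    void process(String url) throws IOException {
        try (InputStream in = protocol.open(url)) {
            handler.handle(parser.parse(in));
        }
    }
}
```

A "save to solr" or "write to disk" handler would just be another HandlerSketch implementation, and a Worker in core would only need to call process().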


DROIDS
=========
We should deliver a few standard use cases where all the plumbing is hooked together:
1. simple web crawler
2. simple filesystem walker
3. IMAP walker




