Hi,

> I think it's redundant to hardcode the indexing logic into all crawler component
> (ftp, http, jdbc, filesys crawler). It's an interesting question how the components
> can communicate? (don't you think using avalon is a good way?)

I've just had a look at Avalon, and it looks promising.

As I've written before, I am thinking of three different component types: sources, transformers and an indexer (Lucene). I thought a little about a flexible way to configure the indexing procedure, and it seems there could be many ways of combining sources, transformers and Lucene.

What do you think about using a blackboard design pattern? Sources produce records into a central repository. Transformers register for records with a particular signature and receive those records for transformation. Finally, when no transformer wants a record anymore, it is delivered to Lucene.

btw: it would be nice if the index could be kept in sync with the indexed data. If files are deleted, the corresponding index entries should be deleted as well.

regards,
manfred
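
A minimal sketch of the blackboard idea described above, written in present-day Java. All names here (`Record`, `Transformer`, `Indexer`, `Blackboard`, the string "signature") are hypothetical and only illustrate the flow; the `Indexer` interface stands in for Lucene and does not call any Lucene API. It also assumes each transformer changes the record's signature, so a record eventually stops matching and is indexed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BlackboardSketch {

    /** A unit of data on the blackboard; "signature" is a hypothetical type tag. */
    static final class Record {
        final String signature;
        final String content;

        Record(String signature, String content) {
            this.signature = signature;
            this.content = content;
        }
    }

    /** A transformer declares which records it wants and how it rewrites them. */
    interface Transformer {
        boolean accepts(Record r);
        Record transform(Record r);
    }

    /** Stand-in for the final indexing step (Lucene in the proposal). */
    interface Indexer {
        void index(Record r);
    }

    /** Central repository: sources post records, transformers pick them up. */
    static final class Blackboard {
        private final Deque<Record> pending = new ArrayDeque<>();
        private final List<Transformer> transformers = new ArrayList<>();
        private final Indexer indexer;

        Blackboard(Indexer indexer) {
            this.indexer = indexer;
        }

        void register(Transformer t) {
            transformers.add(t);
        }

        void post(Record r) {
            pending.add(r);
        }

        /** Process records until the blackboard is empty. */
        void run() {
            while (!pending.isEmpty()) {
                Record r = pending.poll();
                Transformer taker = null;
                for (Transformer t : transformers) {
                    if (t.accepts(r)) {
                        taker = t;
                        break;
                    }
                }
                if (taker == null) {
                    // Nobody wants to transform it anymore: deliver to the indexer.
                    indexer.index(r);
                } else {
                    // Put the transformed record back for further processing.
                    pending.add(taker.transform(r));
                }
            }
        }
    }

    /** Small demo: one HTML-stripping transformer, two source records. */
    public static List<String> demo() {
        List<String> indexed = new ArrayList<>();
        Blackboard board = new Blackboard(r -> indexed.add(r.signature + ":" + r.content));
        board.register(new Transformer() {
            public boolean accepts(Record r) {
                return r.signature.equals("html");
            }
            public Record transform(Record r) {
                // Naive tag stripping, for illustration only.
                return new Record("text", r.content.replaceAll("<[^>]*>", ""));
            }
        });
        board.post(new Record("html", "<p>hello</p>"));
        board.post(new Record("text", "plain"));
        board.run();
        return indexed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this shape, the crawlers (ftp, http, jdbc, filesystem) would only need to call `post`, and the indexing logic would live in one place. Keeping the index in sync on deletion could be handled the same way, for example by posting a record with a "deleted" signature, but that is not shown here.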
