On Thu, 2002-02-14 at 06:42, Manfred Schäfer wrote:
> Hi,
> 
> 
> > I think it's redundant to hardcode the indexing logic into every crawler component
> > (ftp, http, jdbc, filesys crawler). An interesting question is how the components
> > can communicate. (Don't you think using Avalon is a good way?)
> 
> I've just had a look at Avalon, and it looks promising.
> 
> As I've written before, I am thinking of three different component types: sources,
> transformers and an indexer (Lucene). I thought a little about a flexible way to
> configure the indexing procedure, and it seems there could be many
> ways of combining sources, transformers and Lucene. What do you think about
> using a blackboard design pattern? Sources produce records into a central
> repository. Transformers register for records with a particular signature and
> receive those records for transformation. Finally, when no transformer wants a
> record anymore, it is delivered to Lucene.
> 

Right, not sure I want to start out that way though.  Just adding
content handling, location abstraction, etc. is tough enough for the
first iteration... handling an inter-machine communication process makes
it more complex and gives poorer performance on the smaller jobs
(compiling *hello world* would be slower on a Beowulf cluster).  Once we
have the basic case working we can grow out from there.
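
For reference, the blackboard idea quoted above might look roughly like
this in a single process. This is only a sketch under assumptions: every
name here (CrawlRecord, Transformer, Indexer, Blackboard) is hypothetical,
nothing in it is Lucene or Avalon API, and the "signature" is assumed to
be a plain string such as a content type.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical record type: a signature (assumed to be a content type) plus a body.
final class CrawlRecord {
    final String signature;
    final String body;

    CrawlRecord(String signature, String body) {
        this.signature = signature;
        this.body = body;
    }
}

// Hypothetical transformer: claims records by signature and rewrites them.
interface Transformer {
    boolean accepts(CrawlRecord record);
    CrawlRecord transform(CrawlRecord record);
}

// Hypothetical stand-in for the final consumer (Lucene in the proposal).
interface Indexer {
    void index(CrawlRecord record);
}

// Central repository: sources post records, transformers claim them,
// and whatever nobody claims is handed to the indexer.
final class Blackboard {
    private final Deque<CrawlRecord> pending = new ArrayDeque<>();
    private final List<Transformer> transformers = new ArrayList<>();
    private final Indexer indexer;

    Blackboard(Indexer indexer) {
        this.indexer = indexer;
    }

    void register(Transformer transformer) {
        transformers.add(transformer);
    }

    void post(CrawlRecord record) {
        pending.add(record);
    }

    // Drains the queue. A transformer must return a record it no longer
    // accepts (for example by changing the signature), or this never ends.
    void run() {
        while (!pending.isEmpty()) {
            CrawlRecord record = pending.poll();
            Transformer taker = null;
            for (Transformer t : transformers) {
                if (t.accepts(record)) {
                    taker = t;
                    break;
                }
            }
            if (taker != null) {
                pending.add(taker.transform(record)); // back on the board
            } else {
                indexer.index(record); // nobody wants it: deliver to the indexer
            }
        }
    }
}
```

An HTML-stripping transformer would accept "text/html" records and post
back "text/plain" ones, which then fall through to the indexer. Keeping
it in one process like this avoids the inter-machine cost mentioned above.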

> btw: it would be nice if indexing could be kept in sync with the indexed data. If files
> are deleted, the index entries should also be deleted.
> 

On subsequent runs of the crawler, yes.  That sounds good.  Not sure I'd
say that happens immediately (unless you're writing a filesystem driver
or something).
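
One way the "subsequent runs" cleanup could work, as a rough sketch: keep
the set of paths that were indexed last time, compare it with what the
crawler sees now, and drop the difference from the index. IndexSync and
staleEntries are hypothetical names, and the actual removal from the
Lucene index is deliberately not shown.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: works on plain path strings, no Lucene calls.
final class IndexSync {
    // Returns the paths that are in the index but were not seen on this crawl.
    static Set<String> staleEntries(Set<String> indexedPaths, Set<String> currentPaths) {
        Set<String> stale = new TreeSet<>(indexedPaths); // copy, inputs stay untouched
        stale.removeAll(currentPaths);                   // set difference
        return stale;
    }
}
```

Each path returned would then be deleted from the index by whatever key
the documents were stored under.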

> regards,
> 
> manfred
> 
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
> 
-- 
http://www.superlinksoftware.com
http://jakarta.apache.org - port of Excel/Word/OLE 2 Compound Document 
                            format to java
http://developer.java.sun.com/developer/bugParade/bugs/4487555.html 
                        - fix java generics!
The avalanche has already started. It is too late for the pebbles to
vote.
-Ambassador Kosh


