An updated design doc is here:

http://opensource.rightnow.com/htdig4_refactor_design.pdf

Thanks

On Tue, 18 Oct 2005, Neal Richter wrote:

> 
> I've been lax in checking-in myself.
> 
> Anthony Arnone and I have started work on HtDig 4.0
> 
> Here is a blog that Anthony has been keeping on HtDig 4.0 development.
> 
> http://htdig.blogspot.com/
> 
> There is a new branch in CVS.
> 
> http://cvs.sourceforge.net/viewcvs.py/htdig/htdig/?only_with_tag=htdig_4_0
> 
> This is an older design document.. I'll get an updated one put on the blog 
> ASAP.
> http://opensource.rightnow.com/htdig4_refactor_design.pdf
> 
> Basically the idea is to rip out the existing word-index and searching 
> code and replace it with CLucene while preserving as much of htdig 
> configurability as possible.  The function of the spider will be nearly 
> unchanged.  The db.doc.index will still exist, but that's the only thing 
> Berkeley DB will be used for.
> 
> I've removed the hacked version of BDB in 4.0 CVS.
> 
> What do we do about 3.2?  My vote is to call it 'final', update the 
> website and move forward.  I could do this, and have posted this thought 
> in the past.. no consensus emerged and I have no desire to be 
> heavy-handed.
> 
> After having looked at many commercial implementations of search engines 
> over the past few years and following Nutch a bit.. I am still convinced 
> that HtDig has plenty of legs.
> 
> 3.2 has become a road-block to progress.  We know it has issues, and 
> various people have made valiant efforts to address them.  From working 
> with the 'general' list, I have seen plenty of users try moving to 3.2 
> and then move back to 3.1.6.
> 
> On the other hand, people like Christopher Murtagh and myself have used it 
> as a cog in a larger application.
> 
> My thought process for 4.0 is to get the htdig developers to concentrate 
> on building an application for web-servers rather than trying to do it all 
> and maintain the inverted index code... the Lucene community has already 
> cracked that nut.
> 
> Maybe this will get development kick-started again, since it's 100% 
> obvious that none of us is interested in furthering the current 3.2 code, 
> for whatever reason.
> 
> Thanks.
> 
> On Sat, 15 Oct 2005, Gustave Stresen-Reuter wrote:
> 
> > It's been pretty quiet on the list lately. Is the party over?
> > 
> > Ted
> 
> 

-- 
Neal Richter
Sr. Researcher and Machine Learning Lead
Software Development
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485



_______________________________________________
ht://Dig Developer mailing list:
htdig-dev@lists.sourceforge.net
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
