I've been lax about checking in myself.

Anthony Arnone and I have started work on HtDig 4.0.

Here is a blog that Anthony has been keeping on HtDig 4.0 development.

http://htdig.blogspot.com/

There is a new branch in CVS.

http://cvs.sourceforge.net/viewcvs.py/htdig/htdig/?only_with_tag=htdig_4_0

This is an older design document.  I'll get an updated one posted on the 
blog ASAP.
http://opensource.rightnow.com/htdig4_refactor_design.pdf

Basically the idea is to rip out the existing word-index and searching 
code and replace it with CLucene while preserving as much of htdig's 
configurability as possible.  The function of the spider will be nearly 
unchanged.  The db.doc.index will still exist, but that's the only thing 
Berkeley DB will be used for.

I've removed the hacked version of BDB in 4.0 CVS.

What do we do about 3.2?  My vote is to call it 'final', update the 
website and move forward.  I could do this, and have posted this thought 
in the past, but no consensus emerged and I have no desire to be 
heavy-handed.

After having looked at many commercial implementations of search engines 
over the past few years and following Nutch a bit, I am still convinced 
that HtDig has plenty of legs.

3.2 has become a road-block to progress.  We know it has issues, and 
various people have made valiant efforts to address them.  From what I 
have seen working with the 'general' list, plenty of users try moving to 
3.2 and then move back to 3.1.6.

On the other hand, people like Christopher Murtagh and myself have used it 
as a cog in a larger application.

My thought process for 4.0 is to get the htdig developers to concentrate 
on building an application for web-servers rather than trying to do it all 
and maintain the inverted index code... the Lucene community has already 
cracked that nut.

Maybe this will get development kick-started again, since it's 100% 
obvious that none of us are interested in furthering the current 3.2 code, 
for whatever reason.

Thanks.

On Sat, 15 Oct 2005, Gustave Stresen-Reuter wrote:

> It's been pretty quiet on the list lately. Is the party over?
> 
> Ted

-- 
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485





