Here are the results from the first hacked-up version of adding CLucene. I commented out the put() calls to db.words.db and db.excerpts, and added calls to CLucene at the Retriever.cc level to insert the documents.
I'm still verifying that I'm passing in the same amount of information to both.
Search results are still to come.
Thanks
------------------- compressed -----------
real 5m31.248s
user 1m14.760s
sys  0m6.860s

284k webindex/db.docdb
92k  webindex/db.docs.index
5.3M webindex/db.excerpts
60M  webindex/db.words.db
4.0k webindex/luceneidx

Total size: ~66 MB

------------------- lucene -----------
real 2m31.635s
user 1m57.850s
sys  0m25.390s

284k webindex/db.docdb
92k  webindex/db.docs.index
16k  webindex/db.excerpts
4.0k webindex/db.words.db
22M  webindex/luceneidx

Total size: ~22 MB
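
For a quick comparison of the two runs, the arithmetic can be sketched as below. This is only a rough sketch: the helper name to_seconds and the dictionary layout are made up for illustration, and the sizes are the rounded totals quoted above, so the ratios are approximate.

```python
# Rough comparison of the two indexing runs, using the figures quoted above.
# The helper and the data layout are illustrative, not part of ht://Dig.

def to_seconds(stamp):
    """Convert a `time`-style value such as '5m31.248s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

runs = {
    "compressed": {"real": "5m31.248s", "user": "1m14.760s",
                   "sys": "0m6.860s", "size_mb": 66},
    "lucene":     {"real": "2m31.635s", "user": "1m57.850s",
                   "sys": "0m25.390s", "size_mb": 22},
}

def summarize(run):
    """Return wall-clock seconds, CPU seconds (user + sys), and index size."""
    return {
        "wall": to_seconds(run["real"]),
        "cpu": to_seconds(run["user"]) + to_seconds(run["sys"]),
        "size_mb": run["size_mb"],
    }

old = summarize(runs["compressed"])
new = summarize(runs["lucene"])

wall_speedup = old["wall"] / new["wall"]      # > 1 means lucene finished sooner
cpu_ratio = new["cpu"] / old["cpu"]           # > 1 means lucene used more CPU
size_ratio = old["size_mb"] / new["size_mb"]  # > 1 means lucene index is smaller

print(f"wall-clock speedup: {wall_speedup:.2f}x")
print(f"CPU time ratio (lucene/compressed): {cpu_ratio:.2f}x")
print(f"size ratio (compressed/lucene): {size_ratio:.1f}x")
```

So the CLucene run finishes in less than half the wall-clock time and produces an index about a third the size, while using noticeably more CPU time (user + sys). These numbers only mean something once the two runs are confirmed to be indexing the same information.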
--
Neal Richter Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485