According to Geoff Hutchison:
> On Fri, 18 Jan 2002, Gilles Detillieux wrote:
> > dummy ResultList with all valid document IDs.  All it needs is a method
> > to call to get that list of docIDs - that's the part I need help with.
> 
> The reason I suggested in htsearch/main.cpp is that you could get this
> from the DocumentDB class if you're not in the parser. I guess htsearch
> could grab this for the parser as a callback, but this was why I was
> looking to skip the parser entirely.

OK, I've given this some more thought.  The db.docdb in 3.1.x is keyed by
encoded URLs, so to get a list of docIDs from it, you'd essentially need
to read in and decode every record in that database just to get at the
docIDs.  Wouldn't it be much quicker to get them from the db.docs.index,
which is keyed by docID in 3.1?  It's a smaller database, and you'd just
need to traverse the "cursor" part of it to get the list of keys.
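
To make the idea concrete, here is a rough sketch of that key-only
traversal.  It is only an illustration: DocIndex is a hypothetical
stand-in (a std::map keyed by docID) for db.docs.index, and
collectDocIDs is a made-up name, not the real ht://Dig Database
interface.  The point is that only the keys are visited and no record
is ever decoded.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for db.docs.index: an ordered store keyed by docID.
// The mapped value (the encoded URL) is never inspected below.
typedef std::map<int, std::string> DocIndex;

// Walk the index in key order, the way a database cursor would, and keep
// only the keys. This is an assumed helper, not part of ht://Dig.
std::vector<int> collectDocIDs(const DocIndex &index)
{
    std::vector<int> ids;
    ids.reserve(index.size());
    for (DocIndex::const_iterator it = index.begin(); it != index.end(); ++it)
        ids.push_back(it->first);
    return ids;
}
```

The real code would step a cursor over the on-disk database instead of
iterating a std::map, but the shape of the loop should be the same.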

> > dummy record into the db.words.db with a list of cooked-up WordRecords.
> > That would work, but it's not as clean as I'd like.
> 
> It could also be a pretty big record.

Yup, which is a big part of the reason this isn't as clean as I'd like.

> > combining * with and or or doesn't really get you anything, but it might
> > be nice to be able to do "* not foo".
> 
> True. But we should make sure to remember the balance--how much code will
> we add versus the utility of the feature. I see the utility of "return all
> matches, then sort, restrict, etc." I also see the utility of "* not
> foo," but I'm not sure it's as bulletproof. Should we pass the DocumentDB
> to the parser too?

Well, I'd say either we pass the doc_index filename to the parser,
which would create a database instance and open the database only if it
needs it, or we do the opening part in main() regardless and pass the
Database pointer to the parser.  I prefer the former.  Come to think of
it, the parser already does some "config" lookups, and I was going to
add a lookup for prefix_match_character anyway, so why not just look up
doc_index too and forget about passing anything extra?  You know, I just
might be able to code this thing after all.
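
A minimal sketch of the "open only if needed" approach, under stated
assumptions: LazyDocIndex, its Table type and the Opener callback are
all hypothetical names invented for this illustration, and the callback
stands in for whatever actually opens the doc_index file.  Nothing here
is the real parser or Database code.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper: remembers the doc_index filename and opens the
// index only on the first request for the full docID list.
class LazyDocIndex {
public:
    typedef std::map<int, std::string> Table;  // stand-in for the database
    typedef std::function<std::unique_ptr<Table>(const std::string &)> Opener;

    LazyDocIndex(const std::string &filename, Opener opener)
        : filename_(filename), opener_(std::move(opener)) {}

    bool isOpen() const { return table_ != nullptr; }

    // Called only when the query contains "*"; other queries never pay
    // the cost of opening the index.
    std::vector<int> allDocIDs()
    {
        if (!table_)
            table_ = opener_(filename_);
        std::vector<int> ids;
        if (!table_)
            return ids;  // open failed: behave as if there were no documents
        for (Table::const_iterator it = table_->begin(); it != table_->end(); ++it)
            ids.push_back(it->first);
        return ids;
    }

private:
    std::string filename_;
    Opener opener_;
    std::unique_ptr<Table> table_;
};
```

With something like this, the parser would only need the filename from
the config lookup, and a query without "*" would never touch the index.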

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
