Re: Mixing database and lucene searches

Glen Stampoultzis Tue, 11 May 2004 06:28:19 -0700

"Eric Jain" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> > If you *really* don't want to (or can't) put all the searchable fields
> > into lucene, then you are going to need to do a "lucene-db" join.
>
> Here are two good reasons:
>
> 1. Range queries
> 2. Sorting
>
> Yes, Lucene can do both, but I find that in both cases the approach
> Lucene uses is not suitable for large data sets, given limited hardware
> resources.
>
>
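> > Hits hits = searcher.search(new TermQuery(new Term("text", "foo")));
> > Set hitPKs = new HashSet();
> > // collect the stored primary key of every matching document
> > for (int i = 0; i < hits.length(); i++)
> >    hitPKs.add(hits.doc(i).get("pk"));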
>
> Retrieving even one custom field for every document of a possibly large
> data set can end up being very slow, it seems. This complicates things
> a lot...
>
> Unfortunately, I am not aware of any good solutions for combining Lucene
> with a relational database, given the requirements listed above.
> However, one promising approach may involve combining Lucene with the new
> Berkeley DB JE:
>
> 1. Use Lucene to create a bitset of results (position = docid).
> 2. Use BDB to iterate through primary keys, sorted and restricted by one
> (or more?) of several criteria.
>    3. For each primary key, look up docid (this database must be rebuilt
> every time the index is modified).
>    4. If docid set in result bitset, report result.
>
> If anyone has tried anything similar, I'd be interested to know!


Why Berkeley DB?  This sounds like it would work regardless of the database.
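
To make the point concrete, here is a minimal sketch of the four quoted steps with no particular database in it. Everything named here is hypothetical: `BitsetJoinSketch` and `filterInOrder` are made up for illustration, `sortedPks` stands in for whatever sorted, range-restricted key iteration the database provides (a BDB cursor or a SQL `ORDER BY` would both do), and `pkToDocId` stands in for the lookup table that has to be rebuilt whenever the index changes. Filling the bitset from Lucene is left out; I'd assume a HitCollector that sets one bit per matching docid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BitsetJoinSketch {

    /**
     * hits       bit i is set if Lucene document i matched (step 1)
     * sortedPks  primary keys in database order, already sorted and
     *            range-restricted by the database (step 2)
     * pkToDocId  primary key -> Lucene docid (step 3); assumed to be
     *            rebuilt every time the index is modified
     */
    static List<String> filterInOrder(BitSet hits,
                                      Iterable<String> sortedPks,
                                      Map<String, Integer> pkToDocId) {
        List<String> results = new ArrayList<>();
        for (String pk : sortedPks) {
            Integer docId = pkToDocId.get(pk);
            // step 4: report the key only if its docid is in the result bitset
            if (docId != null && hits.get(docId)) {
                results.add(pk);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Pretend Lucene matched documents 0 and 2.
        BitSet hits = new BitSet();
        hits.set(0);
        hits.set(2);

        // Hypothetical key -> docid table.
        Map<String, Integer> pkToDocId = new HashMap<>();
        pkToDocId.put("A", 2);
        pkToDocId.put("B", 1);
        pkToDocId.put("C", 0);

        // The database hands back keys in its own sort order.
        List<String> sortedPks = Arrays.asList("A", "B", "C");

        System.out.println(filterInOrder(hits, sortedPks, pkToDocId));
    }
}
```

The results come out in the database's order, not Lucene's, so sorting and range restriction cost nothing on the Lucene side. Nothing in the loop depends on which store supplies the keys.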

Regards,

Glen





