> If your indexes change a lot, sorting and merging two
> ID lists (one from the index and one from the DB) in
> Java space will certainly be the most effective solution.
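
For reference, the merge being suggested could look roughly like the sketch below. It is only an illustration under two assumptions: both ID lists can be read in ascending ID order, and IDs are unique within each list. The class and method names (SortedIdMerge, intersect) are made up for this example and are not part of Lucene or SAP DB. Because it consumes iterators and hands matches to a callback, it never holds more than one ID per side at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not a Lucene or SAP DB API.
final class SortedIdMerge {

    /**
     * Passes every ID that occurs in both streams to the sink.
     * Assumes both iterators yield unique IDs in ascending order.
     */
    static void intersect(Iterator<Long> indexIds,
                          Iterator<Long> dbIds,
                          Consumer<Long> sink) {
        if (!indexIds.hasNext() || !dbIds.hasNext()) {
            return; // one side is empty, so nothing can match
        }
        long a = indexIds.next();
        long b = dbIds.next();
        while (true) {
            if (a == b) {
                sink.accept(a);
                if (!indexIds.hasNext() || !dbIds.hasNext()) {
                    break;
                }
                a = indexIds.next();
                b = dbIds.next();
            } else if (a < b) {
                // the index side is behind, advance it
                if (!indexIds.hasNext()) {
                    break;
                }
                a = indexIds.next();
            } else {
                // the DB side is behind, advance it
                if (!dbIds.hasNext()) {
                    break;
                }
                b = dbIds.next();
            }
        }
    }

    public static void main(String[] args) {
        List<Long> fromIndex = List.of(1L, 3L, 5L, 7L);
        List<Long> fromDb = List.of(3L, 4L, 7L, 9L);
        List<Long> matches = new ArrayList<>();
        intersect(fromIndex.iterator(), fromDb.iterator(), matches::add);
        System.out.println(matches); // prints [3, 7]
    }
}
```

The merge itself is cheap; the catch is that both sides have to arrive sorted by ID in the first place.
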

I can't fit any of the lists into memory, but I could imagine
maintaining an identifier mapping table. Since Lucene doesn't use
stable identifiers, this table would of course have to be rebuilt
every time the index is updated...

> That's the easy part. Once you have the document IDs you can always
> retrieve the data from the DB (if you paginate the results you even
> only have to retrieve the data page after page).

It is possible that after paginating through the results a user
decides to download all matched documents in full. The only strategy
I have been able to come up with that performs acceptably is to load
objects as blobs from a single table and use other tables only for
restricting the result set. Full-text query results obviously don't
fit in well unless I store them in a temporary table. Unfortunately,
getting them into the temporary table isn't very efficient,
especially in those cases where no more than the first ten items are
ever viewed anyway.

> This is the hard part in any retrieval system. If the sorting
> attributes are not too big I still would extract them with the IDs and
> do the sorting in Java space. If they are really big the TMP table you
> choose is certainly the best option.

I would already be quite happy if results could be obtained sorted by
identifier rather than by relevance; I hope future versions of Lucene
will be a bit more flexible here...

--
Eric Jain

_______________________________________________
sapdb.general mailing list
[EMAIL PROTECTED]
http://listserv.sap.com/mailman/listinfo/sapdb.general
