> I don't think option 3 is baked in at indexing time. Sorry, I misread it. Yes, that is another option.
So if options 3 and 4 are both about search-time selection (based on size and field name respectively), can they be generalized into a more wide-reaching retrieval API?

You can imagine a high-level retrieval language like this:

    Select url, length(contents), substring(descr, 0, 50)

where we have 3 items being returned. The first item (url) is a straight copy of the original field data, the second is the size in bytes of the "contents" field, and the third is a summary of the "descr" field (in this case a simple substring, but conceivably it could be a more sophisticated summarizer, e.g. the highlighter).

If you think of each of these as a retrieval function, we have an API that looks something like this:

    IndexReader.document(int doc, RetrieveFunction[] retrievers);

    interface RetrieveFunction
    {
        Object getValue(FieldMetaData f);
    }

    interface FieldMetaData
    {
        String getFieldName();
        int getSize();
        InputStream getInputStream();
    }

The reader calls the retrievers with a FieldMetaData object for each field, and the data is only loaded from disk if a RetrieveFunction "bites" and asks for the InputStream to get the content for a field. You can imagine that the different RetrieveFunction implementations could then choose not only which fields are returned but also how much content and in what format.

I'm not sure if there is a sufficiently long list of different retriever functions to make this a useful approach.

Cheers
Mark
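
P.S. To make the idea a bit more concrete, here is a rough, self-contained sketch of how the three retrievers from the example query might look. None of this is existing Lucene code. InMemoryField, Retrievers and SketchReader are hypothetical names made up for illustration, the field list passed to document() stands in for the "int doc" lookup, and the "loads" counter is only there to show that a field's content is opened only when a retriever asks for the InputStream. It also assumes Java 9 or later for readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Per-field view handed to each retriever; content is opened only on request. */
interface FieldMetaData {
    String getFieldName();
    int getSize();                 // size of the stored value in bytes
    InputStream getInputStream();  // the only call that touches the content
}

/** A retrieval function; in this sketch it returns null when it does not "bite". */
interface RetrieveFunction {
    Object getValue(FieldMetaData f);
}

/** Hypothetical in-memory stand-in for a stored field on disk. */
final class InMemoryField implements FieldMetaData {
    private final String name;
    private final byte[] stored;
    int loads = 0;  // how many times the content was actually requested

    InMemoryField(String name, String value) {
        this.name = name;
        this.stored = value.getBytes(StandardCharsets.UTF_8);
    }
    public String getFieldName() { return name; }
    public int getSize() { return stored.length; }
    public InputStream getInputStream() {
        loads++;
        return new ByteArrayInputStream(stored);
    }
}

/** Hypothetical retrievers matching: url, length(contents), substring(descr, 0, n). */
final class Retrievers {
    private static String readAll(FieldMetaData f) {
        try (InputStream in = f.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Straight copy of the stored field data. */
    static RetrieveFunction copy(String field) {
        return f -> f.getFieldName().equals(field) ? readAll(f) : null;
    }

    /** Size in bytes, answered from the metadata alone (no content load). */
    static RetrieveFunction length(String field) {
        return f -> f.getFieldName().equals(field) ? Integer.valueOf(f.getSize()) : null;
    }

    /** Simple substring summary; reads the whole value for brevity. */
    static RetrieveFunction substring(String field, int start, int end) {
        return f -> {
            if (!f.getFieldName().equals(field)) return null;
            String s = readAll(f);
            int from = Math.min(Math.max(start, 0), s.length());
            int to = Math.min(Math.max(end, from), s.length());
            return s.substring(from, to);
        };
    }
}

/** Stand-in for IndexReader.document(int doc, RetrieveFunction[] retrievers). */
final class SketchReader {
    static Object[] document(List<? extends FieldMetaData> fields, RetrieveFunction[] retrievers) {
        Object[] result = new Object[retrievers.length];
        for (FieldMetaData f : fields) {
            for (int i = 0; i < retrievers.length; i++) {
                if (result[i] == null) {
                    result[i] = retrievers[i].getValue(f);  // stays null if no bite
                }
            }
        }
        return result;
    }
}
```

With fields url, contents and descr and the retrievers copy("url"), length("contents") and substring("descr", 0, 7), the result array holds the url string, the byte size of contents, and the first 7 characters of descr, and the "contents" field is never opened because length() only consults getSize().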