: Is this the best way to do this? Is there a way to store location : information associated with each term within a field? Note that there can : be thousands of documents containing thousands of pages.
It depends on what's important to you. (FYI: i'm document with "file" in the rest of this mail, to avoid confusion with lucene's Document concept) if your goal is just to find files containing words and order those files by date, or some other fixed attribute of the file -- then making one Document per page is probably fine. If you really want to take advantage of score *per file* then indexing each page is going to make it hard for you to say which files matches a particular search 'best' Something to consider is a two pronged approach: one Document per file (with a special doctype:file field), and one Document per page (doctype:page). Initial searches can be BooleanQueries with a mandatory "doctype:file" clause, and then once you've got a list of fileIds sorted by score, do a secondary BooleanSearch on: your initial search, the fileIds from the first page of your results, "doctype:page". -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]