searching only part of an index
Hi,

I wondered if anyone knows whether it is possible to search ONLY the 100 (or whatever) most recently added documents in a Lucene index. I know that once I have all my results ordered by ID number in Hits I could just display the required number, but is there a way to avoid searching all the documents in the index in the first place?

Many thanks,
Alan

_
Express yourself with cool new emoticons http://www.msn.co.uk/specials/myemo

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: searching only part of an index
You may be able to jimmy the bi filter to produce the most recent 100, but really, keeping your fetch count at 100 and ordering by DOC should be sufficient.

-----Original Message-----
From: Alan Smith [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 27, 2004 4:03 PM
To: [EMAIL PROTECTED]
Subject: searching only part of an index
Re: searching only part of an index
If you know the id of the last document in the index (I don't know the best way to get it), you could probably use a range query: something like "find all docs with the id in [lastId-100 TO lastId]". You should make sure that the lower bound is non-negative, though.

Just a thought,
ioan
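A minimal sketch of the clamped range suggested above, in plain Java with no Lucene dependency. It assumes the ids are also indexed as a field of their own (the class and method names here are made up for illustration), and it zero-pads the bounds to a fixed width of 10 because, as far as I know, range queries over indexed terms compare them as strings. It uses lastId - count + 1 as the lower bound so the range covers exactly `count` ids; [lastId-100 TO lastId] as written would cover 101.

```java
// Hypothetical helper: computes an inclusive id range for the newest `count` ids.
final class RecentIdRange {

    /** Returns {lower, upper}, inclusive, with the lower bound clamped at 0. */
    static int[] bounds(int lastId, int count) {
        if (lastId < 0 || count <= 0) {
            throw new IllegalArgumentException("lastId must be >= 0 and count > 0");
        }
        int lower = Math.max(0, lastId - count + 1);
        return new int[] { lower, lastId };
    }

    /** Renders the bounds as zero-padded text so string order matches numeric order. */
    static String asRangeText(int lastId, int count) {
        int[] b = bounds(lastId, count);
        return "[" + String.format("%010d", b[0]) + " TO " + String.format("%010d", b[1]) + "]";
    }
}
```

With lastId = 40 and count = 100 the lower bound is clamped to 0 rather than going negative. Note that this only works if the ids have no gaps, which is what the rest of the thread goes on to discuss.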
RE: searching only part of an index
Are the DOC ids sequential, or just unique and ascending? I'm thinking like a good little Oracle boy, so does anyone know?

-----Original Message-----
From: Ioan Miftode [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 27, 2004 4:55 PM
To: Lucene Users List
Subject: Re: searching only part of an index
Re: searching only part of an index
I think that if you include the indexing timestamp in the Document you create when indexing, you could sort on it, newest first, and pick only the first 100.

Regards,
Terry
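The timestamp idea could look roughly like the sketch below. It is plain Java rather than Lucene code: the `Doc` class and its `indexedAtMillis` field are stand-ins invented for illustration, on the assumption that each document carries the time it was indexed. It also still looks at every hit before trimming, so it limits what is displayed rather than what is searched.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of search hits that carry an indexing timestamp.
final class NewestFirst {

    static final class Doc {
        final String id;
        final long indexedAtMillis; // assumed to be recorded when the document was indexed

        Doc(String id, long indexedAtMillis) {
            this.id = id;
            this.indexedAtMillis = indexedAtMillis;
        }
    }

    /** Returns up to `limit` hits, most recently indexed first. The input list is not modified. */
    static List<Doc> newest(List<Doc> hits, int limit) {
        List<Doc> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong((Doc d) -> d.indexedAtMillis).reversed());
        int end = Math.min(Math.max(limit, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, end));
    }
}
```

Calling `NewestFirst.newest(hits, 100)` on a list with fewer than 100 entries simply returns all of them in newest-first order.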
Re: searching only part of an index
On Apr 27, 2004, at 9:00 AM, Nader S. Henein wrote:
> Are the DOC ids sequential? Or just unique and ascending, I'm thinking like a good little Oracle boy, so does anyone know?

They are unique and ascending. Gaps in ids exist when documents are removed, and the ids are then squeezed back to completely sequential, with no holes, during an optimize.

Erik
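To make the gaps-then-squeeze behaviour concrete, here is a toy model in plain Java. It is only an illustration of the description above, not how Lucene itself is implemented: the index is modelled as an array of "deleted" flags indexed by doc id (an assumption made for this sketch), and the surviving documents are assumed to keep their relative order when renumbered.

```java
// Toy model: deleted[i] is true when the document with id i has been removed.
final class OptimizeRenumbering {

    /**
     * Returns an array where entry oldId holds the id that document would get
     * once the holes are squeezed out, or -1 if that document was deleted.
     */
    static int[] renumber(boolean[] deleted) {
        int[] newIds = new int[deleted.length];
        int next = 0; // next free id in the compacted numbering
        for (int oldId = 0; oldId < deleted.length; oldId++) {
            newIds[oldId] = deleted[oldId] ? -1 : next++;
        }
        return newIds;
    }
}
```

For flags {false, true, false, false, true, false} the surviving ids 0, 2, 3, 5 become 0, 1, 2, 3, so any id remembered from before the squeeze may point at a different document afterwards.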
RE: searching only part of an index
So if Alan wants to limit it to the first 100, he can't really use a range search unless he can guarantee that the index is optimized after deletes. But if his deletion rounds are anything like mine (every 2 minutes), then optimizing after each round of deletes will make searching the index really slow. Right?

Nader

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 27, 2004 5:15 PM
To: Lucene Users List
Subject: Re: searching only part of an index
Re: searching only part of an index
On Apr 27, 2004, at 9:49 AM, Nader S. Henein wrote:
> So if Alan wants to limit it to the first 100, he can't really use a range search unless he can guarantee that the index is optimized after deletes. But if his deletion rounds are anything like mine (every 2 minutes), then optimizing after each round of deletes will make searching the index really slow. Right?

Well, if you know how many you've deleted, then a range would work :)

(number of docs in index minus 100 minus number deleted = starting range for doc id)
Re: searching only part of an index
On Apr 27, 2004, at 10:24 AM, Erik Hatcher wrote:
> Well, if you know how many you've deleted, then a range would work :)
> (number of docs in index minus 100 minus number deleted = starting range for doc id)

On second thought, this is incorrect; my apologies. To be clever, you'd have to know which positions the deleted documents were in and account for them accordingly.

Erik
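One way to account for the positions of the deleted documents is to walk backwards from the highest id and skip the deleted slots until enough live documents have been collected. The sketch below does that in plain Java over an array of "deleted" flags, which is an assumption made to keep it self-contained; against a real index, IndexReader.maxDoc() and IndexReader.isDeleted(int) would presumably play the roles of the array length and the flag lookup. The result is a BitSet of doc ids, which is the kind of structure a filter could be built around.

```java
import java.util.BitSet;

// Sketch: deleted[i] is true when the document with id i has been removed.
final class RecentLiveDocs {

    /** Returns the ids of the `count` most recently added documents that are not deleted. */
    static BitSet mostRecent(boolean[] deleted, int count) {
        BitSet recent = new BitSet(deleted.length);
        int found = 0;
        // Highest id first, since ids ascend in the order documents were added.
        for (int id = deleted.length - 1; id >= 0 && found < count; id--) {
            if (!deleted[id]) {
                recent.set(id);
                found++;
            }
        }
        return recent;
    }
}
```

For flags {false, false, true, false, true, false} and count = 3 this selects ids 1, 3 and 5. A plain range over the last three ids (3 to 5) would contain only two live documents, which is why simple arithmetic on the counts is not enough.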