searching only part of an index

2004-04-27 Thread Alan Smith
Hi

I wondered if anyone knows whether it is possible to search ONLY the 100 (or
whatever) most recently added documents to a Lucene index? I know that once
I have all my results ordered by ID number in Hits I could then just display
the required amount, but I wondered if there is a way to avoid searching all
documents in the index in the first place?

Many thanks

Alan

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: searching only part of an index

2004-04-27 Thread Nader S. Henein
You may be able to jimmy the bit filter to produce the most recent 100, but
really, keeping your fetch count at 100 and ordering by DOC should be
sufficient.




Re: searching only part of an index

2004-04-27 Thread Ioan Miftode


If you know the id of the last document in the index (I don't know the
best way to get it), you could probably use a range query: something like
find all docs with the id in [lastId-100 TO lastId]. You should probably
make sure that the lower limit is non-negative, though.

Just a thought.

ioan
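
A minimal sketch of the bound arithmetic suggested above, in plain Java with no Lucene calls. The class name, method name and sample values are made up for illustration; how lastId is obtained is left open, as in the message.

```java
// Hypothetical helper: computes the bounds of [lastId-100 TO lastId]
// while keeping the lower limit non-negative.
final class RecentIdRange {

    /** Lower bound of the id range, clamped so it never drops below zero. */
    static int lowerBound(int lastId, int window) {
        return Math.max(0, lastId - window);
    }

    public static void main(String[] args) {
        int lastId = 42;   // assumed id of the newest document
        int window = 100;  // how far back to look
        // Prints "[0 TO 42]" because 42 - 100 would be negative.
        System.out.println("[" + lowerBound(lastId, window) + " TO " + lastId + "]");
    }
}
```

Note that an inclusive range of this form spans window + 1 ids when it is not clamped.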



RE: searching only part of an index

2004-04-27 Thread Nader S. Henein
Are the DOC ids sequential, or just unique and ascending? I'm thinking like
a good little Oracle boy, so does anyone know?




Re: searching only part of an index

2004-04-27 Thread Terry Steichen
I think that if you include the indexing timestamp in the Document you
create when indexing, you could sort on this, newest first, and only pick
the first 100.

Regards,

Terry
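
A rough sketch of the "sort on the timestamp and take the first 100" idea, in plain Java rather than against the Lucene API. The Hit class and its fields are hypothetical stand-ins for whatever a search returns, and the sketch assumes every hit carries the timestamp stored at indexing time.

```java
import java.util.ArrayList;
import java.util.List;

final class NewestFirst {

    /** Hypothetical stand-in for a search hit with its indexing timestamp. */
    static final class Hit {
        final String id;
        final long indexedAtMillis;

        Hit(String id, long indexedAtMillis) {
            this.id = id;
            this.indexedAtMillis = indexedAtMillis;
        }
    }

    /** Returns at most `limit` hits, most recently indexed first. */
    static List<Hit> newest(List<Hit> hits, int limit) {
        List<Hit> sorted = new ArrayList<>(hits);  // leave the input untouched
        sorted.sort((a, b) -> Long.compare(b.indexedAtMillis, a.indexedAtMillis));
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 1000L));
        hits.add(new Hit("b", 3000L));
        hits.add(new Hit("c", 2000L));
        for (Hit h : newest(hits, 2)) {
            System.out.println(h.id + " " + h.indexedAtMillis);
        }
    }
}
```

This only limits what is displayed; every matching document is still collected before sorting.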



Re: searching only part of an index

2004-04-27 Thread Erik Hatcher
On Apr 27, 2004, at 9:00 AM, Nader S. Henein wrote:
> Are the DOC ids sequential? Or just unique and ascending, I'm thinking
> like a good little Oracle boy, so does anyone know?

They are unique and ascending.

Gaps in ids exist when documents are removed, and then the ids are
squeezed back to completely sequential with no holes during an
optimize.

	Erik
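
A toy model of the behaviour described above, in plain Java. It does not use Lucene; the class and method names are invented, and the renumbering is only an illustration of "squeezed back to completely sequential", not Lucene's actual merge code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DocIdGaps {

    /** Ids still live after ids 0..maxDoc-1 were assigned and some were deleted. */
    static List<Integer> liveIds(int maxDoc, Set<Integer> deleted) {
        List<Integer> live = new ArrayList<>();
        for (int id = 0; id < maxDoc; id++) {
            if (!deleted.contains(id)) {
                live.add(id);
            }
        }
        return live;
    }

    /** Renumbers the live documents 0..n-1, keeping their relative order. */
    static List<Integer> compact(List<Integer> liveIds) {
        List<Integer> renumbered = new ArrayList<>();
        for (int i = 0; i < liveIds.size(); i++) {
            renumbered.add(i);
        }
        return renumbered;
    }

    public static void main(String[] args) {
        Set<Integer> deleted = new HashSet<>(Arrays.asList(1, 4));
        List<Integer> live = liveIds(6, deleted);
        System.out.println("before optimize: " + live);          // ids with gaps
        System.out.println("after optimize:  " + compact(live)); // gaps closed
    }
}
```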



RE: searching only part of an index

2004-04-27 Thread Nader S. Henein
So if Alan wants to limit it to the first 100, he can't really use a range
search unless he can guarantee that the index is optimized after deletes.
But if his deletion rounds are anything like mine (every 2 mins), then
optimizing it at each delete will make searching the index really slow.
Right?

Nader



Re: searching only part of an index

2004-04-27 Thread Erik Hatcher
On Apr 27, 2004, at 9:49 AM, Nader S. Henein wrote:
> So if Alan wants to limit it to the first 100 he can't really use a
> range search unless he can guarantee that the index is optimized after
> deletes, but then if his deletion rounds are anything like mine ( every
> 2 mins) then optimizing it at each delete will make searching the index
> really slow. Right?

Well, if you know how many you've deleted, then a range would work :)
(number of docs in index minus 100 minus number deleted = starting
range for doc id)



Re: searching only part of an index

2004-04-27 Thread Erik Hatcher
On Apr 27, 2004, at 10:24 AM, Erik Hatcher wrote:
> On Apr 27, 2004, at 9:49 AM, Nader S. Henein wrote:
>> So if Alan wants to limit it to the first 100 he can't really use a
>> range search unless he can guarantee that the index is optimized after
>> deletes, but then if his deletion rounds are anything like mine ( every
>> 2 mins) then optimizing it at each delete will make searching the index
>> really slow. Right?
>
> Well, if you know how many you've deleted, then a range would work :)
> (number of docs in index minus 100 minus number deleted = starting
> range for doc id)

On second thought, this is incorrect; my apologies. To be clever,
you'd have to know what positions the deleted documents were in and
account for them in that manner.

	Erik
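
One way to "account for them in that manner", sketched in plain Java: walk back from the highest id and skip the deleted ones until enough live documents have been counted. The names here are hypothetical; against a real index the two inputs would presumably come from something like IndexReader.maxDoc() and IndexReader.isDeleted(int), but that mapping is an assumption and is not shown.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class RecentWindow {

    /**
     * Smallest doc id such that the range [start, maxDoc-1] holds `wanted`
     * live documents, or 0 if the index has fewer live documents than that.
     */
    static int startId(int maxDoc, Set<Integer> deleted, int wanted) {
        int live = 0;
        int id = maxDoc;
        while (id > 0 && live < wanted) {
            id--;
            if (!deleted.contains(id)) {
                live++;  // only live documents count toward the window
            }
        }
        return id;
    }

    public static void main(String[] args) {
        // Same number of deletions, different positions, different answers.
        Set<Integer> deletedAtEnd = new HashSet<>(Arrays.asList(8, 9));
        Set<Integer> deletedAtStart = new HashSet<>(Arrays.asList(0, 1));
        System.out.println(startId(10, deletedAtEnd, 3));    // 5
        System.out.println(startId(10, deletedAtStart, 3));  // 7
    }
}
```

The two calls delete the same number of documents yet give different starting ids, which is why the count of deletions alone is not enough.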
