There are a number of scenarios where Lucene might be used to index a fixed,
sliding time window over a continuous stream of data, e.g. a news feed.

In these scenarios I imagine the following facilities would be useful:

a) A MergePolicy that organizes content into segments on the basis of 
increasing time units, e.g. 5 min -> 10 min -> 1 hour -> 1 day
b) The ability to drop entire segments e.g. the day-level segment from exactly 
a week ago 
c) Various new analysis functions comparing term frequencies across time, e.g. 
discovery of "trending" topics.
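
To make c) a little more concrete, here is a minimal sketch of one possible
"trending" measure: the ratio of a term's relative frequency in a recent
window to its relative frequency in an older baseline window, with add-one
smoothing. It is plain Java and does not touch Lucene; the class and method
names (TrendingSketch, trendScore, score) are hypothetical, and the per-window
term counts are assumed to have been gathered already by whatever means.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of any Lucene API.
final class TrendingSketch {

    // Ratio of smoothed relative frequencies: recent window vs. baseline window.
    // Values above 1.0 mean the term is relatively more frequent recently.
    static double trendScore(long recentCount, long recentTotal,
                             long baseCount, long baseTotal) {
        double recentRate = (recentCount + 1.0) / (recentTotal + 1.0);
        double baseRate = (baseCount + 1.0) / (baseTotal + 1.0);
        return recentRate / baseRate;
    }

    // Scores every term seen in the recent window against the baseline counts.
    static Map<String, Double> score(Map<String, Long> recent,
                                     Map<String, Long> baseline) {
        long recentTotal = 0;
        for (long c : recent.values()) recentTotal += c;
        long baseTotal = 0;
        for (long c : baseline.values()) baseTotal += c;

        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Long> e : recent.entrySet()) {
            long baseCount = baseline.getOrDefault(e.getKey(), 0L);
            scores.put(e.getKey(),
                    trendScore(e.getValue(), recentTotal, baseCount, baseTotal));
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Long> recent = new HashMap<>();
        recent.put("election", 50L);
        recent.put("weather", 10L);
        Map<String, Long> baseline = new HashMap<>();
        baseline.put("election", 5L);
        baseline.put("weather", 100L);
        System.out.println(score(recent, baseline));
    }
}
```

With time-organized segments, the "recent" and "baseline" counts could
presumably come from different segments, which is part of the appeal of a).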

I can see that a) could be implemented using a custom MergePolicy and c) can be 
done via existing APIs, but I'm not sure whether there is currently a way to 
simply drop entire segments.
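
For what it's worth, the bookkeeping behind a) and b) can be sketched without
any Lucene classes. The code below is only an assumption about how the time
tiers might be modelled: Bucket is a hypothetical stand-in for a segment
covering a time range, the tier widths are the ones from a), and how any of
this would map onto a real MergePolicy or onto segment removal is left open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; Bucket stands in for a time-ranged segment.
final class TimeTierSketch {

    // Tier widths in milliseconds: 5 min, 10 min, 1 hour, 1 day.
    static final long[] TIER_MILLIS = {
        5 * 60_000L, 10 * 60_000L, 60 * 60_000L, 24 * 60 * 60_000L
    };

    // Covers the half-open time range [startMillis, endMillis).
    static final class Bucket {
        final long startMillis;
        final long endMillis;

        Bucket(long startMillis, long endMillis) {
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }
    }

    // Start of the tier-aligned window that contains the given timestamp.
    static long windowStart(long timestampMillis, int tier) {
        long width = TIER_MILLIS[tier];
        return Math.floorDiv(timestampMillis, width) * width;
    }

    // Two buckets are merge candidates for the next tier if they start
    // inside the same window of that tier.
    static boolean sameNextTierWindow(Bucket a, Bucket b, int nextTier) {
        return windowStart(a.startMillis, nextTier)
                == windowStart(b.startMillis, nextTier);
    }

    // Buckets whose whole range ends at or before (now - retention)
    // could be dropped outright.
    static List<Bucket> expired(List<Bucket> buckets, long nowMillis,
                                long retentionMillis) {
        long cutoff = nowMillis - retentionMillis;
        List<Bucket> out = new ArrayList<>();
        for (Bucket b : buckets) {
            if (b.endMillis <= cutoff) {
                out.add(b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long day = TIER_MILLIS[3];
        List<Bucket> buckets = new ArrayList<>();
        buckets.add(new Bucket(2 * day, 3 * day));
        buckets.add(new Bucket(3 * day, 4 * day));
        // Retain one week of data, evaluated at "day 10".
        System.out.println(expired(buckets, 10 * day, 7 * day).size());
    }
}
```

The point of the sketch is that expiry becomes a cheap range check per bucket,
provided each segment only ever holds documents from one time window.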

Has anyone else had thoughts in this area?

Cheers
Mark

