There are a number of scenarios where Lucene might be used to index a fixed time window over a continuous stream of data, e.g. a news feed.
In these scenarios I imagine the following facilities would be useful:

a) A MergePolicy that organizes content into segments on the basis of increasing time units, e.g. 5 min -> 10 min -> 1 hour -> 1 day.
b) The ability to drop entire segments, e.g. the day-level segment from exactly a week ago.
c) Various new analysis functions comparing term frequencies across time, e.g. discovery of "trending" topics.

I can see that a) could be implemented using a custom MergePolicy and that c) can be done via existing APIs, but I'm not sure whether there is currently a way to simply drop entire segments.

Has anyone else had thoughts in this area?

Cheers,
Mark
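
P.S. To make a) and b) a little more concrete, here is a rough sketch in plain Java of the bookkeeping involved. It deliberately touches no Lucene API: TimeTieredSegments, Segment, bucketStart, canMerge and dropExpired are all made-up names for illustration, and Segment is only a stand-in for the time range a real segment would cover. It also assumes buckets are aligned to UTC epoch boundaries, which is just one possible choice.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class TimeTieredSegments {

    // Tier sizes from the example in the post: 5 min -> 10 min -> 1 hour -> 1 day.
    static final Duration[] TIERS = {
        Duration.ofMinutes(5),
        Duration.ofMinutes(10),
        Duration.ofHours(1),
        Duration.ofDays(1)
    };

    // Hypothetical stand-in for an index segment covering [start, end).
    static final class Segment {
        final Instant start;
        final Instant end;

        Segment(Instant start, Instant end) {
            this.start = start;
            this.end = end;
        }
    }

    // Round a timestamp down to the start of its bucket for the given unit
    // (assumes buckets are aligned to the UTC epoch).
    static Instant bucketStart(Instant t, Duration unit) {
        long unitMs = unit.toMillis();
        return Instant.ofEpochMilli(Math.floorDiv(t.toEpochMilli(), unitMs) * unitMs);
    }

    // Two segments are candidates for merging into the next tier only if
    // they start inside the same bucket of that larger unit.
    static boolean canMerge(Segment a, Segment b, Duration nextUnit) {
        return bucketStart(a.start, nextUnit).equals(bucketStart(b.start, nextUnit));
    }

    // Keep only segments that still overlap the retention window;
    // anything ending at or before (now - retention) is dropped whole.
    static List<Segment> dropExpired(List<Segment> segments, Instant now, Duration retention) {
        Instant cutoff = now.minus(retention);
        List<Segment> kept = new ArrayList<>();
        for (Segment s : segments) {
            if (s.end.isAfter(cutoff)) {
                kept.add(s);
            }
        }
        return kept;
    }
}
```

The idea would be that a time-aware merge policy only ever merges segments for which canMerge is true at the next tier, so every segment covers exactly one time bucket, and expiry then becomes a matter of discarding whole segments rather than deleting individual documents.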
