> you can do that by subclassing IW and call some package private APIs /


To date I have used separate physical indexes, combined them with a 
MultiReader, and dropped the outdated indexes as they expire.
At least this has the benefit that no custom MergePolicy is required to keep 
content from the different dates segregated.
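
To make the rotation scheme concrete, here is a minimal plain-Java model of it. 
This is not Lucene code: DatedPartitions, combinedView and dropBefore are 
hypothetical names, each "partition" is just a list of strings standing in for 
one physical index, and combinedView plays the role a MultiReader would play.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for "one physical index per day"; not a Lucene class. */
class DatedPartitions {
    // Sorted by date, so expiring old data is a head-of-map operation.
    private final TreeMap<LocalDate, List<String>> partitions = new TreeMap<>();

    /** Adds a document to the partition for its date, creating the partition if needed. */
    void add(LocalDate day, String doc) {
        partitions.computeIfAbsent(day, d -> new ArrayList<>()).add(doc);
    }

    /** Combined view over all live partitions, oldest first. */
    List<String> combinedView() {
        List<String> all = new ArrayList<>();
        for (List<String> docs : partitions.values()) {
            all.addAll(docs);
        }
        return all;
    }

    /** Drops every partition dated strictly before the cutoff; returns how many were dropped. */
    int dropBefore(LocalDate cutoff) {
        NavigableMap<LocalDate, List<String>> expired = partitions.headMap(cutoff, false);
        int dropped = expired.size();
        expired.clear(); // headMap is a live view, so this removes from the backing map
        return dropped;
    }

    public static void main(String[] args) {
        DatedPartitions p = new DatedPartitions();
        p.add(LocalDate.of(2011, 11, 1), "old doc");
        p.add(LocalDate.of(2011, 11, 2), "recent doc");
        System.out.println("dropped: " + p.dropBefore(LocalDate.of(2011, 11, 2)));
        System.out.println("live: " + p.combinedView());
    }
}
```

The point of the sketch is that expiry removes whole partitions at once, with no 
per-document deletes and no merge policy involved in keeping dates apart.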

Where I saw the potential was in looking at stream processing technologies 
such as S4 or Esper, which try to count things in time windows.
It struck me that careful organisation of Lucene segments along time units 
could provide an efficient means of accessing and comparing counts of many 
things over time.
It looked like the "Hello World" example in S4 for counting top Twitter topics 
instantiated a Java object per unique topic String, which was then responsible 
for maintaining counts on things. This seems a fairly inefficient way of 
modelling things.
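
As a rough illustration of the alternative I have in mind, the sketch below 
keeps one count table per time bucket rather than one object per topic, and 
compares two buckets to find the term that rose most. It is plain Java, not 
Lucene or S4 code; TimeBucketedCounts, record, count and topRiser are 
hypothetical names, and a bucket id is assumed to be something like an hour 
number.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of per-time-unit term counts; not a Lucene or S4 class. */
class TimeBucketedCounts {
    // bucket id (assumed: e.g. hour number) -> term -> occurrences in that bucket
    private final TreeMap<Long, Map<String, Integer>> buckets = new TreeMap<>();

    /** Records one occurrence of a term in the given time bucket. */
    void record(long bucket, String term) {
        buckets.computeIfAbsent(bucket, b -> new HashMap<>()).merge(term, 1, Integer::sum);
    }

    /** Returns the count table for a bucket, or an empty table if the bucket is unknown. */
    private Map<String, Integer> table(long bucket) {
        Map<String, Integer> m = buckets.get(bucket);
        if (m == null) {
            return Collections.emptyMap();
        }
        return m;
    }

    /** Occurrences of a term in a bucket; 0 if the term or bucket was never seen. */
    int count(long bucket, String term) {
        return table(bucket).getOrDefault(term, 0);
    }

    /**
     * Term in the current bucket whose count rose most relative to the previous
     * bucket, or null if the current bucket is empty. Ties are broken arbitrarily.
     */
    String topRiser(long previous, long current) {
        String best = null;
        int bestDelta = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> e : table(current).entrySet()) {
            int delta = e.getValue() - count(previous, e.getKey());
            if (delta > bestDelta) {
                bestDelta = delta;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TimeBucketedCounts c = new TimeBucketedCounts();
        c.record(1L, "lucene");
        c.record(2L, "lucene");
        c.record(2L, "lucene");
        System.out.println("top riser: " + c.topRiser(1L, 2L));
    }
}
```

In an index organised by time unit, the per-bucket table would correspond to 
the term statistics of the segments covering that period, which is why being 
able to keep segments aligned to dates matters.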

>>If you are willing/able to close the IndexWriter, it's easy to drop segments 
>>by reading the SegmentInfos, editing, and writing back.

My assumption was that ultimately that's what it comes down to. I just wonder 
if this is likely to be a common requirement, deserving of a supported API.



> members. We can certainly make that easier but I personally don't want
> to open this as a public API. I can certainly imagine having a
> protected API that allows dropping entire segments.
>
> simon
>
>> c) Various new analysis functions comparing term frequencies across time e.g 
>> discovery of "trending" topics.
>>
>> I can see that a) could be implemented using a custom MergePolicy and c) can 
>> be done via existing APIs but I'm not sure if there is a way to simply drop 
>> entire segments currently?
>>
>> Anyone else had thoughts in this area?

I had some ideas to add statistics to DocValues that get created
during index time. You can already do that and expose it via
Attributes. Maybe we can add some API to DocValues that you can hook
into so that you don't need to write your own DV impl.
>>
>> Cheers
>> Mark
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>

