TeeSinkCodec and FilteringCodec
-------------------------------

                 Key: LUCENE-2632
                 URL: https://issues.apache.org/jira/browse/LUCENE-2632
             Project: Lucene - Java
          Issue Type: New Feature
          Components: Index
    Affects Versions: 4.0
            Reporter: Andrzej Bialecki 


This issue adds two new Codec implementations:

* TeeSinkCodec: there have been attempts in the past to implement parallel 
writing to multiple indexes so that they all stay synchronized. These attempts 
were hampered by the complexity of the IndexWriter/SegmentMerger logic. The 
solution presented here offers similar functionality but works at a 
different level: as the name suggests, the TeeSinkCodec duplicates term data 
into multiple output Directories, and provides a multi-directory abstraction to 
perform other operations that are not yet handled by the Codec API (e.g. stored 
fields handling).

* FilteringCodec is loosely related to the ideas of index pruning 
presented in LUCENE-1812 and to the concept of tiered search. Since we can use 
TeeSinkCodec to write to multiple output Directories in a synchronized way, we 
could also filter out or modify some of the data that is being written. The 
FilteringCodec provides this functionality, so that you can use it like this:
{code}
IndexWriter --> TeeSinkCodec
                 |  |
                 |  +--> StandardCodec --> Directory1
                 +--> FilteringCodec --> StandardCodec --> Directory2
{code}

The end result of this chain is two indexes that are kept in sync - one is the 
full regular index, and the other one is a filtered index.
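
The chain in the diagram can be sketched in plain Java as follows. This is only 
an illustration of the tee-and-filter pattern, not the code attached to this 
issue and not the Lucene Codec API: TermSink, ListSink, TeeSink and 
FilteringSink are hypothetical names invented for the example, ListSink stands 
in for "StandardCodec --> Directory", and the length-based filter is an 
arbitrary placeholder for a real pruning rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types exist in Lucene.
class TeeFilterSketch {

    /** Assumed stand-in for the term-writing side of a codec. */
    interface TermSink {
        void addTerm(String field, String term);
    }

    /** Stand-in for "StandardCodec --> Directory": records what it receives. */
    static class ListSink implements TermSink {
        final List<String> terms = new ArrayList<>();

        @Override
        public void addTerm(String field, String term) {
            terms.add(field + ":" + term);
        }
    }

    /** Duplicates every term into all downstream sinks, in order. */
    static class TeeSink implements TermSink {
        private final List<TermSink> outputs;

        TeeSink(TermSink... outputs) {
            this.outputs = Arrays.asList(outputs);
        }

        @Override
        public void addTerm(String field, String term) {
            for (TermSink out : outputs) {
                out.addTerm(field, term);
            }
        }
    }

    /** Forwards only the terms accepted by the predicate; drops the rest. */
    static class FilteringSink implements TermSink {
        private final Predicate<String> keep;
        private final TermSink delegate;

        FilteringSink(Predicate<String> keep, TermSink delegate) {
            this.keep = keep;
            this.delegate = delegate;
        }

        @Override
        public void addTerm(String field, String term) {
            if (keep.test(term)) {
                delegate.addTerm(field, term);
            }
        }
    }

    public static void main(String[] args) {
        ListSink full = new ListSink();      // plays the role of Directory1
        ListSink filtered = new ListSink();  // plays the role of Directory2

        // Tee into the full sink and into a filter that drops short terms.
        TermSink chain = new TeeSink(
                full,
                new FilteringSink(term -> term.length() > 3, filtered));

        for (String term : new String[] {"apache", "the", "lucene"}) {
            chain.addTerm("body", term);
        }

        System.out.println("full:     " + full.terms);
        System.out.println("filtered: " + filtered.terms);
    }
}
```

Because both sinks are fed from the same addTerm call, they see the terms in 
the same order and cannot drift apart; the filtered sink simply receives a 
subset of what the full sink receives.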


