[ https://issues.apache.org/jira/browse/LUCENE-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874445#action_12874445 ]
Otis Gospodnetic commented on LUCENE-2425:
------------------------------------------

Karthick, it looks like your May 1st comment ended with "The split policies under development include:", but without the actual list of those policies.

> An Anti-Merging Multi-Directory Indexing Framework
> --------------------------------------------------
>
>                 Key: LUCENE-2425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2425
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*, Index
>    Affects Versions: 3.0.1
>            Reporter: Karthick Sankarachary
>         Attachments: LUCENE-2425.patch
>
>
> By design, a Lucene index tends to merge documents that span multiple
> segments into fewer segments, in order to optimize its directory structure,
> which in turn leads to better search performance. In particular, it relies on
> a merge policy to specify the set of merge operations that should be
> performed when the index is optimized.
> Often times, there's a need to do the exact opposite, which is to "split" the
> documents. This calls for a mechanism that facilitates sub-division of
> documents based on a certain (ideally, user-defined) algorithm. By way of
> example, one may wish to sub-divide (or partition) documents based on
> parameters such as time, space, real-timeliness, and so on. Herein, we
> describe an indexing framework that builds on the Lucene index writer and
> reader, to address use cases wherein documents need to diverge rather than
> converge.
> In brief, it associates zero or more sub-directories with the index's
> directory, which serve to complement it in some manner. The sub-directories
> (a.k.a. splits) are managed by a split policy, which is notified of all
> changes made to the index directory (a.k.a. super-directory), thus allowing
> it to modify its sub-directories as it sees fit. To make the index reader and
> writer "observable", we extend Lucene's reader and writer with the goal of
> providing hooks into every method that could potentially change the index.
> This allows for propagation of such changes to the split policy, which
> essentially acts as a listener on the index.
> We refer to each sub-directory (or split) and the super-directory as a
> sub-index of the containing index (a.k.a. the split index). Note that the
> sub-directory may not necessarily be co-located with the super-directory.
> Furthermore, the split policy in turn relies on one or more split rules to
> determine when to add or remove sub-directories. This allows for a clear
> separation of the event that triggers a split from the management of those
> splits.
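
For readers following the thread, the listener arrangement in the quoted description (an observable writer that notifies a split policy, which in turn consults a split rule) might be sketched roughly as below. This is only an illustration and is not taken from LUCENE-2425.patch: the names SplitSketch, SplitRule, SplitPolicy, ObservableWriter, onDocumentAdded and shouldSplit are all hypothetical, and plain in-memory lists stand in for Lucene directories.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every type and method name here is hypothetical
// and does not come from Lucene or from the attached patch.
final class SplitSketch {

    /** Hypothetical rule: decides WHEN a new split should be started. */
    interface SplitRule {
        boolean shouldSplit(int docsInCurrentSplit);
    }

    /** Hypothetical policy: listens for index changes and MANAGES the splits. */
    static final class SplitPolicy {
        private final SplitRule rule;
        // Each inner list stands in for one sub-directory (split).
        private final List<List<String>> splits = new ArrayList<>();

        SplitPolicy(SplitRule rule) {
            this.rule = rule;
            splits.add(new ArrayList<>());
        }

        /** Callback invoked by the observable writer after each add. */
        void onDocumentAdded(String doc) {
            List<String> current = splits.get(splits.size() - 1);
            if (rule.shouldSplit(current.size())) {
                // The rule fired: open a new split before storing the document.
                current = new ArrayList<>();
                splits.add(current);
            }
            current.add(doc);
        }

        int splitCount() {
            return splits.size();
        }

        List<String> split(int i) {
            return splits.get(i);
        }
    }

    /** Hypothetical writer that reports every change to the policy. */
    static final class ObservableWriter {
        // Stands in for the super-directory.
        private final List<String> superDirectory = new ArrayList<>();
        private final SplitPolicy policy;

        ObservableWriter(SplitPolicy policy) {
            this.policy = policy;
        }

        void addDocument(String doc) {
            superDirectory.add(doc);      // change the index itself
            policy.onDocumentAdded(doc);  // then propagate the change to the listener
        }

        int docCount() {
            return superDirectory.size();
        }
    }

    public static void main(String[] args) {
        // Example rule (an assumption for this sketch): start a new split every two documents.
        SplitPolicy policy = new SplitPolicy(n -> n >= 2);
        ObservableWriter writer = new ObservableWriter(policy);
        writer.addDocument("a");
        writer.addDocument("b");
        writer.addDocument("c");
        System.out.println(policy.splitCount() + " splits for " + writer.docCount() + " documents");
    }
}
```

The point of keeping SplitRule separate from SplitPolicy in this sketch is the separation the description mentions: the rule only answers whether a split is due, while the policy owns the sub-directories and decides what to do with them.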