Hi,

In our application, we have a requirement to introduce extra fields into indexed Lucene documents (with customized boosts to assist in correct scoring) and then modify the query that gets sent to Lucene to use those inserted fields.

The requirement doesn't quite fit to be included in Oak in general, but it seems that we can have extension points during indexing and querying which can be hooked into to serve a custom application requirement. Following is a proposal for such an extension (I have a few changes which implement a basic version... I'll be opening an issue and attaching a patch to it soon).

The idea of the extension is very similar to the custom scoring extension point we already have. For my application, we just need to hook into full text querying, so the proposal is limited to that. It can certainly be extended later - but, let's start simple :).

We can have an SPI (let's call it IndexAugmentor for now) which has methods like:
* String getName();
* Collection<Field> getAugmentedFields(String path, NodeState indexedNodeState, NodeState indexDefnState);
* Query getCustomQuery(String fullTextTerm, Analyzer

The string returned by getName() identifies a particular implementation of the SPI. An index definition can declare the augmentor implementation to be used by this name.

getAugmentedFields(...) is given the nodeState being indexed (along with the index definition, if the implementation wants to utilize it), and the implementation is supposed to return a collection of Lucene field objects that need to be added to the document that's inserted into Lucene.

getCustomQuery(...) allows the implementation to give an extra query per full text query term. The returned query would be added to the generated query for the fulltext term with BooleanClause.Occur.SHOULD, i.e. the custom query could only add to the results that were already available (along with, of course, utilizing the custom boost inserted at index time).

Currently, I've added a call-back to getAugmentedFields in LuceneIndexEditor.makeDocument - which seems like the most obvious place to do it. For querying though, there are a couple of choices.
I've added the callback in LucenePropertyIndex.tokenToQuery, where the global full text query (not tied to a field) is being prepared.

As already said, I'll be opening issues/tasks and attaching patches to them. I'll post those numbers to this thread. It'd be great to have some feedback on the idea and on whether it makes sense to have such an extension.

Thanks,
Vikas
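
For reference, here is a rough, self-contained sketch of the shape the IndexAugmentor SPI described above might take. It is only an illustration under assumptions, not the actual patch: Field, Query, Analyzer and NodeState below are minimal stand-ins rather than the real Lucene/Oak classes, TitleBoostAugmentor, the "augmented_title" field and the withShouldClause helper are hypothetical, and since the getCustomQuery signature above is cut off after "Analyzer", the sketch stops at that parameter too.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class IndexAugmentorSketch {

    // Stand-ins so the sketch compiles alone; NOT the real Lucene/Oak classes.
    static final class Field {
        final String name; final String value; final float boost;
        Field(String name, String value, float boost) {
            this.name = name; this.value = value; this.boost = boost;
        }
    }
    static final class Query {
        final String text;
        Query(String text) { this.text = text; }
    }
    interface Analyzer {}
    static final class NodeState {
        final Map<String, String> properties;
        NodeState(Map<String, String> properties) { this.properties = properties; }
    }

    // Shape of the proposed SPI, following the method list in the proposal.
    interface IndexAugmentor {
        String getName();
        Collection<Field> getAugmentedFields(String path, NodeState indexedNodeState, NodeState indexDefnState);
        // Parameters after the analyzer are unknown (signature is truncated in the mail).
        Query getCustomQuery(String fullTextTerm, Analyzer analyzer);
    }

    // Hypothetical implementation: copies a "title" property into a boosted extra field.
    static final class TitleBoostAugmentor implements IndexAugmentor {
        @Override public String getName() { return "titleBoost"; }
        @Override public Collection<Field> getAugmentedFields(String path, NodeState indexedNodeState, NodeState indexDefnState) {
            String title = indexedNodeState.properties.get("title");
            if (title == null) return Collections.emptyList();
            return Collections.singletonList(new Field("augmented_title", title, 2.0f));
        }
        @Override public Query getCustomQuery(String fullTextTerm, Analyzer analyzer) {
            return new Query("augmented_title:" + fullTextTerm);
        }
    }

    // Mimics adding the custom query as an extra optional (SHOULD) clause
    // next to the query already generated for the full text term.
    static List<Query> withShouldClause(Query generated, IndexAugmentor augmentor, String term, Analyzer analyzer) {
        List<Query> optional = new ArrayList<>();
        optional.add(generated);
        Query custom = augmentor.getCustomQuery(term, analyzer);
        if (custom != null) optional.add(custom);
        return optional;
    }

    public static void main(String[] args) {
        IndexAugmentor augmentor = new TitleBoostAugmentor();
        NodeState node = new NodeState(Collections.singletonMap("title", "hello"));
        NodeState defn = new NodeState(Collections.<String, String>emptyMap());
        for (Field f : augmentor.getAugmentedFields("/content/a", node, defn)) {
            System.out.println(f.name + " = " + f.value + " (boost " + f.boost + ")");
        }
        for (Query q : withShouldClause(new Query("hello"), augmentor, "hello", null)) {
            System.out.println("optional clause: " + q.text);
        }
    }
}
```

In the real change, the optional clauses would presumably be added to a Lucene BooleanQuery with BooleanClause.Occur.SHOULD rather than collected in a list; the list is used here only to keep the sketch free of dependencies.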
