[
https://issues.apache.org/jira/browse/OAK-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059566#comment-15059566
]
Chetan Mehrotra commented on OAK-3576:
--------------------------------------
[~catholicon] Missed reviewing this new feature due to other on going work. So
some late comments before we consider this as fixed
*SPI Interface*
{code}
public interface IndexFieldProvider {
Iterable<Field> getAugmentedFields(final String path, final String
propertyName,
final NodeState document, final
PropertyState property,
final NodeState indexDefinition);
}
{code}
I think going per property is too fine grained. Instead it would be better to
just pass in the NodeState of the root node which was found to be updated. So
instead of that this interface should be like
{code}
/**
* Callback to add custom fields for given node
*
* @param path path of the document being indexed
* @param node node being indexed
* @param indexDefinition nodeState of the index definition node
*
* @return fields which are to be added as part of {@link
org.apache.lucene.document.Document} being prepared
* for node being indexed
*/
Iterable<Field> getAugmentedFields(String path, NodeState node, NodeState
indexDefinition);
Set<String> getSupportedTypes();
{code}
*How callbacks are selected*
bq. All registered augmentors would get callbacks. It's upto the implementation
to decide to quit early.
Thinking more about it now it would implementation bit tricky and each
implementation would need to figure out how to opt out. So would be better to
add a method {{supportedTypes}} where the implementation can provide a list of
supported nodeTypes for which it wishes to participate. This aligns with the
nodeType based approach used by indexing logic currently. Same rule should be
applicable on the query side
This would impact other part of current design. For keeping things fast it
would better to track this providers against the matching {{IndexRule}}
*IndexAugmentorFactory*
We can do away with that and instead provide default noop impls as part of
interface itself. The tracking of multiple implementations can be simplified
with use of {{Tracker}}. See {{ConsolidatedCacheStats}} for an example. Keep
stuff consistent with matching {{IndexRule}} and OSGi dynamics would add some
complication. I can take care of this part
*Misc*
* Move the SPI package to
{{org.apache.jackrabbit.oak.plugins.index.lucene.spi.fieldprovider}} and add
{{ConsumerType}} annotation. Also add package-info with initial version. This
would ensure that addition of any new class or change in one othe SPI class
does not require minor version update.
* Minor nit pick - Avoid reordering of imports in commit. It causes issue in
later merge to branches!
> Allow custom extension to augment indexed lucene documents
> ----------------------------------------------------------
>
> Key: OAK-3576
> URL: https://issues.apache.org/jira/browse/OAK-3576
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: lucene
> Reporter: Vikas Saurabh
> Assignee: Vikas Saurabh
> Fix For: 1.3.13
>
> Attachments: OAK-3576.jsedding.patch, OAK-3576.wip.patch
>
>
> Following up on http://oak.markmail.org/thread/a53ahsgb3bowtwyq, we should
> have an extension point in oak to allow custom code to add fields to
> documents getting indexed in lucene. We'd also need to allow extension point
> to add extra query terms to utilize such augmented fields.
> (cc [~teofili], [~chetanm])
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)