[
https://issues.apache.org/jira/browse/OAK-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808720#comment-16808720
]
Thomas Mueller edited comment on OAK-8166 at 4/4/19 2:29 PM:
-------------------------------------------------------------
[~tmueller], [~catholicon] - the field names are not colliding here. Refer to
the document values below, with and without the issue.
{noformat}
*// with issue*
Document<stored,indexed,omitNorms,indexOptions=DOCS_ONLY<:path:/test>
docValueType=SORTED<:*dvjcr:content/n0/testOak*:[74 65 73 74]>
docValueType=SORTED<:*dvjcr:content/n0/testOak*:[74 65 73 74]>
docValueType=SORTED<:*dvfunction*upper*@jcr:content/n0/testOak*:[54 45 53 54]>
indexed,omitNorms,indexOptions=DOCS_ONLY<*function*upper*@jcr:content/n0/testOak*:TEST>>
*// without issue (not with the fix, but with ordered removed from the prop def
with function)*
Document<stored,indexed,omitNorms,indexOptions=DOCS_ONLY<:path:/test>
docValueType=SORTED<:*dvjcr:content/n0/testOak*:[74 65 73 74]>
indexed,omitNorms,indexOptions=DOCS_ONLY<*function*upper@jcr:content/n0/testOak*:TEST*>>
{noformat}
So the problem is not that the field names for the function and non-function
instances of the property def are the same.
As I mentioned in my last comment, the problem is that in the current
scenario the field dvjcr:content/n0/testOak gets added twice because of the
flow that I described.
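To make the duplicate easy to spot in a dump like the one above, here is a minimal sketch in plain Java (no Oak or Lucene dependency). The class and method names are hypothetical, and the field names are copied from the dump on the assumption that the asterisks wrapping them are JIRA emphasis markup rather than part of the name. It lists the names that occur more than once and decodes the hex bytes shown for the doc values.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Oak: inspects field names taken from a document dump.
final class DuplicateFieldCheck {

    // Returns the field names that occur more than once, in first-seen order.
    static List<String> duplicateNames(List<String> fieldNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : fieldNames) {
            counts.merge(name, 1, Integer::sum);
        }
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    // Decodes a space-separated hex byte dump such as "74 65 73 74".
    static String decodeHex(String hexBytes) {
        String[] parts = hexBytes.trim().split("\\s+");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            bytes[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Field names from the "with issue" document (assumed to exclude markup asterisks).
        List<String> withIssue = Arrays.asList(
                ":path",
                ":dvjcr:content/n0/testOak",
                ":dvjcr:content/n0/testOak",
                ":dvfunction*upper*@jcr:content/n0/testOak",
                "function*upper*@jcr:content/n0/testOak");

        System.out.println(duplicateNames(withIssue)); // prints [:dvjcr:content/n0/testOak]
        System.out.println(decodeHex("74 65 73 74"));  // prints test
        System.out.println(decodeHex("54 45 53 54"));  // prints TEST
    }
}
```

Only the plain doc-values field shows up as a duplicate; the function field has a different name and occurs once, which matches the observation that the names themselves do not collide.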
It doesn't impact non-relative properties because of
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexDefinition.java#L1179#L1198]
- for non-relative properties, the propAggregate list here will be
empty (because of the checks at line 1183 and line 1197), so the
matchers list will also be empty and the fields are finally added to the doc
via this code -
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/editor/FulltextDocumentMaker.java#L138]
(for field_name(ordered(name))) and
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/editor/FulltextDocumentMaker.java#L150]
(for field_name(ordered(function(name)))).
For relative properties, the code block at
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/editor/FulltextDocumentMaker.java#L138]
doesn't come into play, and
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/editor/FulltextDocumentMaker.java#L146]
executes the flow wherein onResult is called twice, for the reasons
mentioned above, and adds field_name(ordered(name)) twice.
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/editor/FulltextDocumentMaker.java#L150]
adds field_name(ordered(function(name))) as usual and as expected.
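One way to picture a guard against the double add is sketched below. This is an illustration only: it is not the code in FulltextDocumentMaker and not necessarily what OAK-8166_1.patch does. The class name, the onResult signature, and the ":dv" prefixing are assumptions made for the example (the prefix simply mirrors the names in the dump above). The idea is to remember which ordered fields were already added for the current document and skip a second add.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical per-document guard; one instance is assumed per indexed document.
final class OrderedFieldGuard {
    private final List<String> docValuesFields = new ArrayList<>();
    private final Set<String> addedNames = new HashSet<>();

    // Stand-in for the callback that fires per matched relative property.
    // Returns true if the ordered field was added, false if it was already present.
    boolean onResult(String propertyName) {
        String fieldName = ":dv" + propertyName; // assumed naming, mirrors the dump
        if (!addedNames.add(fieldName)) {
            return false; // second call for the same property: skip the duplicate
        }
        docValuesFields.add(fieldName);
        return true;
    }

    List<String> docValuesFields() {
        return docValuesFields;
    }

    public static void main(String[] args) {
        OrderedFieldGuard doc = new OrderedFieldGuard();
        // The same relative property is reported twice, as in the flow described above.
        System.out.println(doc.onResult("jcr:content/n0/testOak")); // prints true
        System.out.println(doc.onResult("jcr:content/n0/testOak")); // prints false
        System.out.println(doc.docValuesFields()); // prints [:dvjcr:content/n0/testOak]
    }
}
```

Whether deduplicating at add time or avoiding the second callback is the better fix depends on the actual flow, which this sketch does not model.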
I am not sure whether I clarified things or made them more confusing. Maybe we
can discuss it on a call tomorrow.
was (Author: nitigup):
> Index definition with orderable property definitions with and without
> functions breaks index
> --------------------------------------------------------------------------------------------
>
> Key: OAK-8166
> URL: https://issues.apache.org/jira/browse/OAK-8166
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: indexing
> Affects Versions: 1.8.12
> Reporter: Tom Blackford
> Priority: Major
> Attachments: OAK-8166_1.patch
>
>
> If an index definition contains the same orderable property with and without
> functions, it will fail to index any node which contains that property. The
> failure will be logged as [1].
> Steps to reproduce:
> * Configure index with the two property definitions shown at [2].
> * Refresh the index definition
> * Modify a node that falls under the definition - it will fail with the
> exception shown at [1]
> * Modify the 'non-function' index definition to not be orderable
> (orderable=false)
> * Refresh the index definition
> * Modify the same node - note there is no exception.
> Thanks to [~catholicon] for assistance identifying root cause.
> [1]
> {code}
> 25.03.2019 15:39:04.135 *WARN* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor Failed to index the node [/content/dam/Unknown-2.png]
> java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/metadata/dc:title" appears more than once in this document (only one value is allowed per field)
>     at org.apache.lucene.index.SortedDocValuesWriter.addValue(SortedDocValuesWriter.java:62) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.DocValuesProcessor.addSortedField(DocValuesProcessor.java:125) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:59) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1534) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.updateDocument(DefaultIndexWriter.java:86) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.addOrUpdate(LuceneIndexEditor.java:258) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:140) [org.apache.jackrabbit.oak-lucene:1.8.9]
>     at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.leave(CompositeEditor.java:74) [org.apache.jackrabbit.oak-store-spi:1.8.9]
> {code}
> [2]
> {code}
> "dcTitle": {
>     "jcr:primaryType": "nt:unstructured",
>     "nodeScopeIndex": "true",
>     "useInSuggest": "true",
>     "ordered": "true",
>     "propertyIndex": "true",
>     "useInSpellcheck": "true",
>     "name": "jcr:content/metadata/dc:title",
>     "boost": "2.0"
> },
> "dcTitleLowercase": {
>     "jcr:primaryType": "nt:unstructured",
>     "ordered": "true",
>     "propertyIndex": "true",
>     "function": "fn:lower-case(jcr:content/metadata/@dc:title)"
> }
> {code}
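The exception at [1] says that a doc-values field of this kind accepts only one value per document. The following is a minimal model of that constraint in plain Java, with no Lucene dependency. The class and method names are hypothetical and this is not Lucene's implementation; only the wording of the error message is taken from the log at [1].

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a single-valued doc-values constraint for one document.
final class SingleValuedDocValues {
    private final Set<String> fieldsInDocument = new HashSet<>();

    // Records a sorted value for the field; a second value for the same field is rejected.
    // The value itself is ignored in this model, only the field name matters.
    void addSortedValue(String fieldName, String value) {
        if (!fieldsInDocument.add(fieldName)) {
            throw new IllegalArgumentException("DocValuesField \"" + fieldName
                    + "\" appears more than once in this document"
                    + " (only one value is allowed per field)");
        }
    }

    public static void main(String[] args) {
        SingleValuedDocValues doc = new SingleValuedDocValues();
        String field = ":dvjcr:content/metadata/dc:title";
        doc.addSortedValue(field, "Some title"); // first add is accepted
        try {
            doc.addSortedValue(field, "Some title"); // same field again, even with the same value
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Note that the second add fails even though the value is identical, which is consistent with the node failing to index whenever the same ordered field is added twice.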