[ 
https://issues.apache.org/jira/browse/JCR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated JCR-2906:
---------------------------------

    Attachment: JCR-2906.patch

Really good analysis, thanks for pointing out where the problem is!

The problem is not whether the JCR spec defines sorting on a multi-valued 
property (MVP). The problem is that the sort behavior is not stable when 
dealing with MVPs.

As Paul correctly pointed out, whenever an MVP is present, the value in the 
cache gets overwritten by the last value found by the Lucene term query. So in 
fact an MVP is represented in the sort by just one of its values (which can 
apparently change at runtime; this is easy to reproduce by running the attached 
test a few times).
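
To illustrate the overwrite, here is a minimal sketch that uses no Jackrabbit or Lucene code. OverwriteDemo, Posting and fillCache are hypothetical names invented for this example. It only assumes a cache with one slot per document that is filled while iterating over (document, value) pairs:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a per-document sort cache; not Jackrabbit code.
final class OverwriteDemo {

    /** One (document number, value) pair, as a term enumeration might yield it. */
    static final class Posting {
        final int doc;
        final String value;

        Posting(int doc, String value) {
            this.doc = doc;
            this.value = value;
        }
    }

    /**
     * Fills a cache with one slot per document. A later posting for the
     * same document silently replaces the earlier one.
     */
    static String[] fillCache(int docCount, List<Posting> postings) {
        String[] cache = new String[docCount];
        for (Posting p : postings) {
            cache[p.doc] = p.value; // any previous value for p.doc is lost
        }
        return cache;
    }
}
```

With document 2 holding both "dddd" and "bbbb", the slot for document 2 ends up as whichever value is visited last, so the sort key depends on the iteration order rather than on the property itself.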

The solution is to use the position info that comes via Lucene's TermPositions. 
This contains the term's position within the current document, which allows us 
to use it as an index into the MVP's values.
The downside is that the Comparables have to support arrays as well as simple 
values, so I've added a class (ComparableArray) that simply delegates compareTo 
calls to the inner array of Comparables. This way all the query languages 
(XPath, SQL and SQL2) sort MVPs in a similar way.
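
For readers without the patch at hand, the idea can be sketched roughly as follows. This is not the class from the attached patch: the name ComparableArraySketch, the insert method and the element-by-element ordering (a shorter array sorts first when it is a prefix of the longer one) are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch; the ComparableArray in the attached patch may differ.
@SuppressWarnings({"rawtypes", "unchecked"})
final class ComparableArraySketch implements Comparable<ComparableArraySketch> {

    private Comparable[] values = new Comparable[0];

    /** Stores a value at the given position, growing the array if needed. */
    ComparableArraySketch insert(Comparable value, int position) {
        if (position >= values.length) {
            values = Arrays.copyOf(values, position + 1);
        }
        values[position] = value;
        return this;
    }

    /** Compares element by element; on a common prefix the shorter array is smaller. */
    @Override
    public int compareTo(ComparableArraySketch other) {
        int n = Math.min(values.length, other.values.length);
        for (int i = 0; i < n; i++) {
            int c = compareNullable(values[i], other.values[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(values.length, other.values.length);
    }

    /** Treats an unfilled (null) slot as smaller than any value. */
    private static int compareNullable(Comparable a, Comparable b) {
        if (a == null) {
            return b == null ? 0 : -1;
        }
        if (b == null) {
            return 1;
        }
        return a.compareTo(b);
    }
}
```

Because each value is stored at its position, the order in which terms are visited no longer matters: inserting "bbbb" at position 1 and then "dddd" at position 0 still yields ["dddd", "bbbb"], which sorts after a single "cccc".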



Attaching patch.

                
> Multivalued property sorted by last/random value
> ------------------------------------------------
>
>                 Key: JCR-2906
>                 URL: https://issues.apache.org/jira/browse/JCR-2906
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 2.2
>         Environment: Windows 7, Sun JDK 1.6.0_23
>            Reporter: Paul Lysak
>              Labels: multivalued, sort
>         Attachments: JCR-2906.patch
>
>
> Sorting on a multivalued property may produce an incorrect result because 
> sorting is performed only by the last value of the multivalued property.
> Steps to reproduce:
> 1. Create a multivalued field in the repository. Example from nodetypes file:
> <propertyDefinition name="MyProperty" requiredType="String" 
> autoCreated="false" mandatory="false"
>    onParentVersion="COPY" protected="false" multiple="true">
> 2. Create a few records so that all records except one contain a single 
> value for MyProperty, and one record contains two values: 
> a first value which is greater than that of any other record, and a second 
> value which is somewhere in the middle. Here is an example:
> 1st record: "aaaa"
> 2nd record: "cccc"
> 3rd record: "dddd", "bbbb"
> 3. Run a query which sorts on this property. Example of an XPath query:
> //*[...here are some criteria...] order by @MyProperty ascending
> The query returns the documents in this order:
> "aaaa"
> "dddd", "bbbb"
> "cccc"
> which is not the expected order (expected: the same order as they were 
> entered, since "aaaa" < "cccc" and "cccc" < "dddd").
> After some digging I found out that this happens because the method 
> org.apache.jackrabbit.core.query.lucene.SharedFieldCache.getValueIndex
> (called from 
> org.apache.jackrabbit.core.query.lucene.SharedFieldSortComparator.SimpleScoreDocComparator
>  constructor)
> returns only the last Comparable of the document. Here it overwrites the 
> previous value:
> retArray[termDocs.doc()] = getValue(value, type);
> I tried to concatenate comparables (just to check if it would work for my 
> case):
>       if(retArray[termDocs.doc()] == null) {
>               retArray[termDocs.doc()] = getValue(value, type);
>       } else {
>               retArray[termDocs.doc()] =
>                       retArray[termDocs.doc()] + " " + getValue(value, type);
>       }
> But it didn't work well either: TermEnum does not return terms in the same 
> order as Jackrabbit returns the values of a multivalued field
> (as an example, ["qwer", "asdf"] may become ["asdf", "qwer"]). So simple 
> concatenation doesn't help.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
