Amit Jain created OAK-10384:
-------------------------------

             Summary: Fix stripping of large indexed ordered properties
                 Key: OAK-10384
                 URL: https://issues.apache.org/jira/browse/OAK-10384
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: lucene
            Reporter: Amit Jain
            Assignee: Amit Jain


Currently, ordered indexed properties are truncated to 32766, the maximum 
length supported by Lucene, in the 
[LuceneDocumentMaker|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneDocumentMaker.java#L290-L294].

The problem is that Lucene represents strings with the {{BytesRef}} class, 
which converts them to UTF-8, and it enforces the limit on the length of that 
converted byte sequence. Converting a Java Unicode string to UTF-8 can 
increase the length for non-ASCII characters, so a value truncated by its Java 
string length can still exceed the limit.
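
The length mismatch can be illustrated with a minimal sketch that uses only the 
JDK. The class and helper names below are hypothetical and are not taken from 
Oak or Lucene. The sketch assumes the limit applies to the UTF-8 byte count and 
shows one possible way to truncate by encoded size rather than by Java string 
length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: not the Oak or Lucene implementation.
final class Utf8Truncation {
    // Limit quoted in the issue; assumed here to be counted in UTF-8 bytes.
    static final int MAX_BYTES = 32766;

    /** Number of bytes the string occupies once encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Hypothetical helper: longest prefix of s whose UTF-8 encoding fits in
     * maxBytes. Walks code points so a surrogate pair is never split.
     */
    static String truncateToUtf8Bytes(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // UTF-8 size of one code point: 1 to 4 bytes.
            int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + size > maxBytes) {
                break;
            }
            bytes += size;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }

    /** Builds a string of n copies of c. */
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String ascii = repeat('a', MAX_BYTES);
        String accented = repeat('\u00e9', MAX_BYTES); // U+00E9, 2 bytes in UTF-8

        System.out.println(utf8Length(ascii));     // 32766: fits
        System.out.println(utf8Length(accented));  // 65532: same Java length, twice the bytes

        String safe = truncateToUtf8Bytes(accented, MAX_BYTES);
        System.out.println(safe.length());         // 16383
        System.out.println(utf8Length(safe));      // 32766
    }
}
```

Both input strings have a Java length of 32766, but only the ASCII one stays 
within the limit after encoding. Truncating by encoded size keeps the result 
within the byte limit regardless of which characters the value contains.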



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
