Amit Jain created OAK-10384:
-------------------------------

             Summary: Fix stripping of large indexed ordered properties
                 Key: OAK-10384
                 URL: https://issues.apache.org/jira/browse/OAK-10384
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: lucene
            Reporter: Amit Jain
            Assignee: Amit Jain

Currently, ordered indexed properties are truncated to 32766, the maximum length supported by Lucene, in the [LuceneDocumentMaker|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneDocumentMaker.java#L290-L294]. The problem is that Lucene represents strings with the class {{BytesRef}}, which converts them to UTF-8, and it enforces the limit on the length of that converted form. Converting a Java Unicode string to UTF-8 can increase the length for non-ASCII characters, so a value that was truncated by character count can still exceed the limit once it is encoded.
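
The length mismatch described above can be reproduced with plain JDK calls. The following is a minimal sketch only, not the fix applied in Oak: the class {{Utf8TruncationSketch}} and its methods {{utf8Length}} and {{truncateToUtf8Bytes}} are hypothetical names introduced for illustration, and the constant simply repeats the 32766 limit quoted in this issue. It assumes that one way to respect the limit is to truncate by encoded byte count instead of by Java character count.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Oak or Lucene.
final class Utf8TruncationSketch {

    // The limit quoted in the issue, interpreted here as a UTF-8 byte count.
    static final int MAX_TERM_BYTES = 32766;

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Returns the longest prefix of s whose UTF-8 encoding fits in maxBytes,
    // without splitting a surrogate pair. A lone surrogate is counted as
    // 3 bytes, which is a conservative over-estimate.
    static String truncateToUtf8Bytes(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + size > maxBytes) {
                break;
            }
            bytes += size;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }

    public static void main(String[] args) {
        // Build a value of exactly MAX_TERM_BYTES non-ASCII characters.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < MAX_TERM_BYTES; i++) {
            sb.append('\u00e9'); // U+00E9 takes 2 bytes in UTF-8
        }
        String value = sb.toString();

        System.out.println(value.length());    // 32766 Java chars
        System.out.println(utf8Length(value)); // 65532 UTF-8 bytes

        String safe = truncateToUtf8Bytes(value, MAX_TERM_BYTES);
        System.out.println(utf8Length(safe));  // 32766 UTF-8 bytes
    }
}
```

The value built in {{main}} is within the limit when measured in Java characters but twice the limit when measured in UTF-8 bytes, which is the case the character-based truncation misses.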