nfsantos commented on code in PR #2180:
URL: https://github.com/apache/jackrabbit-oak/pull/2180#discussion_r1995023966
##########
oak-search-elastic/src/main/java/org/apache/jackrabbit/oak/plugins/index/elastic/query/ElasticRequestHandler.java:
##########
@@ -908,6 +914,66 @@ private static QueryStringQuery.Builder fullTextQuery(String text, String fieldN
return qsqBuilder.fields(fieldName);
}
+ private String rewriteQueryText(String text) {
+ String rewritten = FulltextIndex.rewriteQueryText(text);
+
+ // here we handle special cases where the syntax used in the lucene 4.x query parser is not supported by the current version
+ if (rewritten.contains("~")) {
+ rewritten = convertFuzzyQuery(rewritten);
+ }
+
+ return rewritten;
+ }
+
+ /**
+ * Converts Lucene fuzzy queries from the old syntax (float similarity) to the new syntax (edit distance).
+ * <p>
+ * In Lucene 4, fuzzy queries were specified using a floating-point similarity (e.g., "term~0.8"), where values
+ * closer to 1 required a higher similarity match. In later Lucene versions, this was replaced with a discrete
+ * edit distance (0, 1, or 2).
+ * <p>
+ * This method:
+ * <ul>
+ * <li>Detects and converts old fuzzy queries (e.g., "roam~0.7" → "roam~1").</li>
+ * <li>Preserves new fuzzy queries (e.g., "test~2" remains unchanged).</li>
+ * <li>Avoids modifying proximity queries (e.g., "\"quick fox\"~5" remains unchanged).</li>
+ * </ul>
+ *
+ * @param text The input query string containing fuzzy or proximity queries.
+ * @return A query string where old fuzzy syntax is converted to the new format.
+ */
+ private String convertFuzzyQuery(String text) {
+ Matcher oldMatcher = OLD_FUZZY_PATTERN.matcher(text);
+ StringBuilder result = new StringBuilder();
+
+ while (oldMatcher.find()) {
Review Comment:
If the query does not contain a fuzzy expression, this method still creates a
copy of the argument. That can happen when the `~` in the query is quoted and
is not part of a fuzzy expression, or if someone later removes the `~` check
done before calling this method. We could instead return the argument
unchanged when nothing matches the old fuzzy pattern and avoid the copy.
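
A rough sketch of what I have in mind, not a drop-in replacement. The regex
below is a hypothetical stand-in for `OLD_FUZZY_PATTERN` (its real definition
is not shown in this hunk), and `toEditDistance` is an assumed mapping that
happens to agree with the "roam~0.7" → "roam~1" example in the Javadoc; the
PR may use a different rule. The point is only the early return on the first
failed `find()`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FuzzyRewriteSketch {

    // Hypothetical stand-in for OLD_FUZZY_PATTERN: a word, '~', then a float such as 0.7.
    private static final Pattern OLD_FUZZY_PATTERN = Pattern.compile("(\\w+)~(0\\.\\d+)");

    static String convertFuzzyQuery(String text) {
        Matcher matcher = OLD_FUZZY_PATTERN.matcher(text);
        if (!matcher.find()) {
            // No old-style fuzzy expression: return the same instance, no copy.
            return text;
        }

        StringBuilder result = new StringBuilder(text.length());
        int last = 0;
        do {
            String term = matcher.group(1);
            double similarity = Double.parseDouble(matcher.group(2));
            // Copy the text between the previous match and this one unchanged.
            result.append(text, last, matcher.start());
            result.append(term).append('~').append(toEditDistance(similarity, term.length()));
            last = matcher.end();
        } while (matcher.find());
        // Copy whatever follows the last match.
        result.append(text, last, text.length());
        return result.toString();
    }

    // Assumed mapping from similarity to edit distance, capped at 2.
    private static int toEditDistance(double similarity, int termLength) {
        return Math.min((int) ((1.0 - similarity) * termLength), 2);
    }

    public static void main(String[] args) {
        String proximity = "\"quick fox\"~5";
        // True when the unchanged input is returned as the same instance.
        System.out.println(convertFuzzyQuery(proximity) == proximity);
        System.out.println(convertFuzzyQuery("roam~0.7"));
    }
}
```

With this shape the `StringBuilder` is only allocated once a match is known to
exist, so the method stays cheap even if the caller's `~` check is removed.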