Github user afs commented on a diff in the pull request:
https://github.com/apache/jena/pull/406#discussion_r183788841
--- Diff:
jena-text-es/src/main/java/org/apache/jena/query/text/es/TextIndexES.java ---
@@ -422,6 +422,27 @@ public EntityDefinition getDocDef() {
}
private String parse(String fieldName, String qs, String lang) {
+ //Escape special characters if any in the query string
+ qs = qs.replaceAll("\\:", "\\\\:")
+ .replaceAll("\\+", "\\\\+")
+ .replaceAll("\\-", "\\\\-")
+ .replaceAll("\\=", "\\\\=")
+ .replaceAll("\\&", "\\\\&")
+ .replaceAll("\\|", "\\\\|")
+ .replaceAll("\\>", "\\\\>")
+ .replaceAll("\\<", "\\\\<")
+ .replaceAll("\\!", "\\\\!")
+ .replaceAll("\\(", "\\\\(")
+ .replaceAll("\\)", "\\\\)")
+ .replaceAll("\\{", "\\\\{")
+ .replaceAll("\\}", "\\\\}")
+ .replaceAll("\\]", "\\\\]")
+ .replaceAll("\\[", "\\\\[")
+ .replaceAll("\\^", "\\\\^")
+ .replaceAll("\\~", "\\\\~")
+ .replaceAll("\\?", "\\\\?");
+
--- End diff --
See the [Lucene escape
code|https://github.com/apache/lucene-solr/blob/master/lucene/queryparser/src/java/org/apache/lucene/queryparser/classic/QueryParserBase.java#L971].
It also escapes `\\`, which the change above does not, and makes a single pass over the string.
The thing to watch for with `replaceAll` is that it compiles a new regex on
every call, which can mount up even for these fixed-string patterns.
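
A minimal sketch of the single-pass approach, for illustration only. The class name `QueryEscaper` and the constant `SPECIAL` are hypothetical; the character set is the one from the diff above plus the backslash, not necessarily the exact set Lucene escapes.

```java
// Hypothetical helper, not part of TextIndexES or Lucene.
final class QueryEscaper {
    // Assumption: the characters escaped in the diff above, plus '\' itself.
    private static final String SPECIAL = "\\:+-=&|><!(){}[]^~?";

    private QueryEscaper() {}

    // Walks the string once and prefixes each special character with a
    // backslash. No regex is compiled, so there is no per-call pattern cost.
    static String escape(String qs) {
        StringBuilder sb = new StringBuilder(qs.length() + 8);
        for (int i = 0; i < qs.length(); i++) {
            char c = qs.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With something like this, the chain of `replaceAll` calls in `parse` could shrink to `qs = QueryEscaper.escape(qs);`. Note that escaping the backslash changes the result for input that is already escaped, so that part would need checking against existing callers.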
---