nfsantos commented on code in PR #649:
URL: https://github.com/apache/jackrabbit-oak/pull/649#discussion_r951598922
##########
oak-search-elastic/src/main/java/org/apache/jackrabbit/oak/plugins/index/elastic/ElasticIndexDefinition.java:
##########
@@ -220,6 +222,35 @@ public String getElasticKeyword(String propertyName) {
return field;
}
+ public String getElasticTextField(String propertyName) {
+ List<PropertyDefinition> propertyDefinitions = propertiesByName.get(propertyName);
+ if (propertyDefinitions == null) {
+ return propertyName;
+ }
+
+ Type<?> type = null;
+ for (PropertyDefinition pd : propertyDefinitions) {
+ type = Type.fromTag(pd.getType(), false);
+ }
+
+ if (isAnalyzed(propertyDefinitions)) {
+ // The full text index for String properties is <propertyName>, while for non-string properties is part of
+ // the multi-field and is called <propertyName>.text
+ if (Type.BINARY.equals(type) ||
Review Comment:
In Elastic, Binary is a Base64 encoded string, which by default is not
indexed.
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/binary.html
I have not changed the way the JCR binary type is mapped. Even before this
PR, the mapping logic handled Binary, Long, Double, Date and Boolean all in
the same way, and text in a different way. But Binary does seem to require
special treatment, as I see no good use case for indexing it. That is, are
there any situations where it makes sense to run either term-level or
full-text queries on Binary fields?
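
To make the question concrete, here is a rough sketch of what special-casing
Binary could look like when resolving the full-text field of an analyzed
property. This is not the code in this PR: `BinaryFieldSketch`, `PropType` and
`analyzedTextField` are hypothetical names, and returning an empty result for
Binary is only one possible design, assuming we decide Binary should never be
queried.

```java
import java.util.Optional;

class BinaryFieldSketch {
    // Hypothetical stand-in for the JCR property types mentioned above.
    enum PropType { STRING, BINARY, LONG, DOUBLE, DATE, BOOLEAN }

    // Resolves the full-text field name for a property that is analyzed.
    // Assumption: Binary gets no queryable field at all, so the result is empty.
    static Optional<String> analyzedTextField(String propertyName, PropType type) {
        if (type == PropType.BINARY) {
            return Optional.empty();
        }
        // As in the diff comment: String properties use <propertyName>,
        // other types use the <propertyName>.text sub-field of the multi-field.
        if (type == PropType.STRING) {
            return Optional.of(propertyName);
        }
        return Optional.of(propertyName + ".text");
    }
}
```

With this shape, callers would have to decide what to do when no field is
returned (skip the clause, or reject the query), which is the part I am unsure
about.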