[
https://issues.apache.org/jira/browse/CALCITE-3027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004323#comment-18004323
]
Julian Hyde edited comment on CALCITE-3027 at 7/16/25 2:43 PM:
---------------------------------------------------------------
Had been supported in CALCITE-7068, so I closed the issue.
was (Author: JIRAUSER307770):
Had been supported in https://issues.apache.org/jira/browse/CALCITE-7068, so I
closed the issue.
> Support LIKE operator in Elasticsearch
> --------------------------------------
>
> Key: CALCITE-3027
> URL: https://issues.apache.org/jira/browse/CALCITE-3027
> Project: Calcite
> Issue Type: New Feature
> Components: elasticsearch-adapter
> Reporter: yeyonghao
> Assignee: Tim Grein
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> In Elasticsearch, fuzzy matching is implemented by wildcard query:
> {code:java}
> GET /company/_search
> {
> "query": {
> "constant_score": {
> "filter": {
> "wildcard":{
> "name_text":"*Alle_"
> }
> }
> }
> }
> }
> {code}
> The symbols % and _ in sql are equivalent to the symbols * and ? in es,
> respectively.
> So I added a new QueryBuilder class to support wildcard queries:
> {code:java}
> static class WildcardQueryBuilder extends QueryBuilder {
> private final String fieldName;
> private final String value;
> WildcardQueryBuilder(String fieldName, String value) {
> this.fieldName = fieldName;
> this.value = transformValue(value);
> }
> @Override void writeJson(JsonGenerator generator) throws IOException {
> generator.writeStartObject();
> generator.writeFieldName("wildcard");
> generator.writeStartObject();
> generator.writeFieldName(fieldName);
> writeObject(generator, value);
> generator.writeEndObject();
> generator.writeEndObject();
> }
> // The symbols % and _ in sql are equivalent to the symbols * and ? in
> es, respectively.
> private String transformValue(String value) {
> if (value != null) {
> value = value
> .replaceAll("%", "*")
> .replaceAll("_", "?");
> }
> return value;
> }
> }
> {code}
> Below I give some examples to illustrate this like query.
> ----
> *Elasticsearch Version*
> {code:java}
> $ ./elasticsearch --version
> Version: 6.1.2, Build: Unknown/2019-03-19T15:31:12.734Z, JVM: 1.8.0_181
> {code}
> *Mock Data In Elasticsearch*
> {code:java}
> PUT company
> {
> "mappings": {
> "doc": {
> "properties": {
> "name_keyword":{
> "type": "keyword"
> },
> "name_text":{
> "type": "text"
> }
> }
> }
> }
> }
> PUT /company/doc/_bulk?refresh
> {"index":{}}
> {"name_keyword":"hello to Allen","name_text":"hello to Allen"}
> {"index":{}}
> {"name_keyword":"say hi william","name_text":"say hi william"}
> {"index":{}}
> {"name_keyword":"hello to Rose","name_text":"hello to Rose"}
> {code}
> *Test in sqlline*
> model.json
> {code:java}
> {
> "version": "1.0",
> "defaultSchema": "elasticsearch",
> "schemas": [
> {
> "type": "custom",
> "name": "elasticsearch",
> "factory":
> "org.apache.calcite.adapter.elasticsearch.ElasticsearchSchemaFactory",
> "operand": {
> "coordinates": "{'127.0.0.1': 9200}",
> "index": "company"
> }
> }
> ]
> }
> {code}
> sqlline test
> {code:java}
> $ ./sqlline
> sqlline version 1.7.0
> sqlline> !connect jdbc:calcite:model=model.json admin admin
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/Users/yeyonghao/maven/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/Users/yeyonghao/maven/repository/ch/qos/logback/logback-classic/1.1.3/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Transaction isolation level TRANSACTION_REPEATABLE_READ is not supported.
> Default (TRANSACTION_NONE) will be used instead.
> 0: jdbc:calcite:model=model.json> select * from "doc";
> +---------------------------------------------------------+
> | _MAP |
> +---------------------------------------------------------+
> | {name_keyword=hello to Allen, name_text=hello to Allen} |
> | {name_keyword=hello to Rose, name_text=hello to Rose} |
> | {name_keyword=say hi william, name_text=say hi william} |
> +---------------------------------------------------------+
> 3 rows selected (1.536 seconds)
> 0: jdbc:calcite:model=model.json>
> 0: jdbc:calcite:model=model.json> select * from "doc" where
> _MAP['name_keyword'] like '%Alle_';
> +---------------------------------------------------------+
> | _MAP |
> +---------------------------------------------------------+
> | {name_keyword=hello to Allen, name_text=hello to Allen} |
> +---------------------------------------------------------+
> 1 row selected (0.092 seconds)
> 0: jdbc:calcite:model=model.json>
> 0: jdbc:calcite:model=model.json> select * from "doc" where
> _MAP['name_keyword'] like 'hello to Rose';
> +-------------------------------------------------------+
> | _MAP |
> +-------------------------------------------------------+
> | {name_keyword=hello to Rose, name_text=hello to Rose} |
> +-------------------------------------------------------+
> 1 row selected (0.063 seconds)
> 0: jdbc:calcite:model=model.json>
> 0: jdbc:calcite:model=model.json> select * from "doc" where
> _MAP['name_keyword'] like '__llo to Ros_';
> +-------------------------------------------------------+
> | _MAP |
> +-------------------------------------------------------+
> | {name_keyword=hello to Rose, name_text=hello to Rose} |
> +-------------------------------------------------------+
> 1 row selected (0.048 seconds)
> {code}
> ----
> *Attention*
> In Elasticsearch, the wildcard searches for the contents of the inverted
> index table, so the behavior is different for fields of type text and field
> type of keyword:
> 1.If the field type is keyword, es will index the entire field content, so
> the es wildcard search behavior is the same as sql's like.
> 2.If the field type is text, the content of the field will be saved into the
> index table after being segmented. So in this case, the es wildcard search is
> different from the like search in sql. In fact, at this time we recommend
> using full-text search instead of wildcard search.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)