This is an automated email from the ASF dual-hosted git repository.

victoria pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new e57f880020 document new filters and stuff (#14760)
e57f880020 is described below

commit e57f8800206ec34c0b21382eec65b620a1885e53
Author: Clint Wylie <[email protected]>
AuthorDate: Tue Aug 8 16:01:06 2023 -0700

    document new filters and stuff (#14760)
---
 docs/ingestion/ingestion-spec.md   |   2 +-
 docs/ingestion/schema-design.md    |   2 +
 docs/querying/filters.md           | 738 ++++++++++++++++++++++++++-----------
 docs/querying/sql-query-context.md |   1 +
 4 files changed, 523 insertions(+), 220 deletions(-)

diff --git a/docs/ingestion/ingestion-spec.md b/docs/ingestion/ingestion-spec.md
index 43baf601bd..6ed2f0a838 100644
--- a/docs/ingestion/ingestion-spec.md
+++ b/docs/ingestion/ingestion-spec.md
@@ -186,7 +186,7 @@ Treat `__time` as a millisecond timestamp: the number of 
milliseconds since Jan
 ### `dimensionsSpec`
 
 The `dimensionsSpec` is located in `dataSchema` → `dimensionsSpec` and is 
responsible for
-configuring [dimensions](./schema-model.md#dimensions). An example 
`dimensionsSpec` is:
+configuring [dimensions](./schema-model.md#dimensions).
 
 You can either manually specify the dimensions or take advantage of schema 
auto-discovery where you allow Druid to infer all or some of the schema for 
your data. This means that you don't have to explicitly specify your dimensions 
and their type. 
 
diff --git a/docs/ingestion/schema-design.md b/docs/ingestion/schema-design.md
index 060a9dc7ab..7fd29c1d0e 100644
--- a/docs/ingestion/schema-design.md
+++ b/docs/ingestion/schema-design.md
@@ -261,6 +261,8 @@ native boolean types, Druid ingests these values as strings 
if `druid.expression
 the [array functions](../querying/sql-array-functions.md) or 
[UNNEST](../querying/sql-functions.md#unnest). Nested
 columns can be queried with the [JSON 
functions](../querying/sql-json-functions.md).
 
+We also highly recommend setting `druid.generic.useDefaultValueForNull=false` 
when using these columns since it also enables out of the box `ARRAY` type 
filtering. If not set to `false`, setting `sqlUseBoundsAndSelectors` to `false` 
on the [SQL query context](../querying/sql-query-context.md) can enable `ARRAY` 
filtering instead.
+
 Mixed type columns are stored in the _least_ restrictive type that can 
represent all values in the column. For example:
 
 - Mixed numeric columns are `DOUBLE`
diff --git a/docs/querying/filters.md b/docs/querying/filters.md
index 82fdb81168..74ae3f406a 100644
--- a/docs/querying/filters.md
+++ b/docs/querying/filters.md
@@ -35,263 +35,221 @@ Apache Druid supports the following types of filters.
 
 ## Selector filter
 
-The simplest filter is a selector filter. The selector filter will match a 
specific dimension with a specific value. Selector filters can be used as the 
base filters for more complex Boolean expressions of filters.
+The simplest filter is a selector filter. The selector filter matches a 
specific dimension with a specific value. Selector filters can be used as the 
base filters for more complex Boolean expressions of filters.
 
-The grammar for a SELECTOR filter is as follows:
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "selector".| Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `value` | String value to match. | No. If not specified the filter matches 
NULL values. |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
 
-``` json
-"filter": { "type": "selector", "dimension": <dimension_string>, "value": 
<dimension_value_string> }
-```
-
-This is the equivalent of `WHERE <dimension_string> = 
'<dimension_value_string>'` or `WHERE <dimension_string> IS NULL`
-(if the `value` is `null`).
+The selector filter can only match against `STRING` (single and multi-valued), 
`LONG`, `FLOAT`, `DOUBLE` types. Use the newer null and equality filters to 
match against `ARRAY` or `COMPLEX` types.
 
-The selector filter supports the use of extraction functions, see [Filtering 
with Extraction Functions](#filtering-with-extraction-functions) for details.
+When the selector filter matches against numeric inputs, the string `value` 
will be best-effort coerced into a numeric value.
 
-## Column comparison filter
-
-The column comparison filter is similar to the selector filter, but instead 
compares dimensions to each other. For example:
+### Example: equivalent of `WHERE someColumn = 'hello'`
 
 ``` json
-"filter": { "type": "columnComparison", "dimensions": [<dimension_a>, 
<dimension_b>] }
+{ "type": "selector", "dimension": "someColumn", "value": "hello" }
 ```
 
-This is the equivalent of `WHERE <dimension_a> = <dimension_b>`.
 
-`dimensions` is list of [DimensionSpecs](./dimensionspecs.md), making it 
possible to apply an extraction function if needed.
-
-## Regular expression filter
-
-The regular expression filter is similar to the selector filter, but using 
regular expressions. It matches the specified dimension with the given pattern. 
The pattern can be any standard [Java regular 
expression](http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html).
+### Example: equivalent of `WHERE someColumn IS NULL`
 
 ``` json
-"filter": { "type": "regex", "dimension": <dimension_string>, "pattern": 
<pattern_string> }
+{ "type": "selector", "dimension": "someColumn", "value": null }
 ```
 
-The regex filter supports the use of extraction functions, see [Filtering with 
Extraction Functions](#filtering-with-extraction-functions) for details.
-
-
-## Logical expression filters
-
-### AND
 
-The grammar for an AND filter is as follows:
+## Equality Filter
 
-``` json
-"filter": { "type": "and", "fields": [<filter>, <filter>, ...] }
-```
+The equality filter is a replacement for the selector filter with the ability 
to match against any type of column. The equality filter is designed to have 
more SQL compatible behavior than the selector filter and so can not match null 
values. To match null values use the null filter.
 
-The filters in fields can be any other filter defined on this page.
+Druid's SQL planner uses the equality filter by default instead of selector 
filter whenever `druid.generic.useDefaultValueForNull=false`, or if 
`sqlUseBoundAndSelectors` is set to false on the [SQL query 
context](./sql-query-context.md).
 
-### OR
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "equality".| Yes |
+| `column` | Input column or virtual column name to filter. | Yes |
+| `matchValueType` | String specifying the type of value to match. For example 
`STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY<STRING>`, `ARRAY<LONG>`, or any 
other Druid type. The `matchValueType` determines how Druid interprets the 
`matchValue` to assist in converting to the type of the matched `column`. | Yes 
|
+| `matchValue` | Value to match, must not be null. | Yes |
 
-The grammar for an OR filter is as follows:
+### Example: equivalent of `WHERE someColumn = 'hello'`
 
-``` json
-"filter": { "type": "or", "fields": [<filter>, <filter>, ...] }
+```json
+{ "type": "equality", "column": "someColumn", "matchValueType": "STRING", 
"matchValue": "hello" }
 ```
 
-The filters in fields can be any other filter defined on this page.
+### Example: equivalent of `WHERE someNumericColumn = 1.23`
 
-### NOT
+```json
+{ "type": "equality", "column": "someNumericColumn", "matchValueType": 
"DOUBLE", "matchValue": 1.23 }
+```
 
-The grammar for a NOT filter is as follows:
+### Example: equivalent of `WHERE someArrayColumn = ARRAY[1, 2, 3]`
 
 ```json
-"filter": { "type": "not", "field": <filter> }
+{ "type": "equality", "column": "someArrayColumn", "matchValueType": 
"ARRAY<LONG>", "matchValue": [1, 2, 3] }
 ```
 
-The filter specified at field can be any other filter defined on this page.
 
-## JavaScript filter
+## Null Filter
 
-The JavaScript filter matches a dimension against the specified JavaScript 
function predicate. The filter matches values for which the function returns 
true.
+The null filter is a partial replacement for the selector filter. It is 
dedicated to matching NULL values.
 
-The function takes a single argument, the dimension value, and returns either 
true or false.
+Druid's SQL planner uses the null filter by default instead of selector filter 
whenever `druid.generic.useDefaultValueForNull=false`, or if 
`sqlUseBoundAndSelectors` is set to false on the [SQL query 
context](./sql-query-context.md).
 
-```json
-{
-  "type" : "javascript",
-  "dimension" : <dimension_string>,
-  "function" : "function(value) { <...> }"
-}
-```
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "null".| Yes |
+| `column` | Input column or virtual column name to filter. | Yes |
 
-**Example**
-The following matches any dimension values for the dimension `name` between 
`'bar'` and `'foo'`
+### Example: equivalent of `WHERE someColumn IS NULL`
 
 ```json
-{
-  "type" : "javascript",
-  "dimension" : "name",
-  "function" : "function(x) { return(x >= 'bar' && x <= 'foo') }"
-}
+{ "type": "null", "column": "someColumn" }
 ```
 
-The JavaScript filter supports the use of extraction functions, see [Filtering 
with Extraction Functions](#filtering-with-extraction-functions) for details.
-
-> JavaScript-based functionality is disabled by default. Please refer to the 
Druid [JavaScript programming guide](../development/javascript.md) for 
guidelines about using Druid's JavaScript functionality, including instructions 
on how to enable it.
 
-## Extraction filter
-
-> The extraction filter is now deprecated. The selector filter with an 
extraction function specified
-> provides identical functionality and should be used instead.
+## Column comparison filter
 
-Extraction filter matches a dimension using some specific [Extraction 
function](./dimensionspecs.md#extraction-functions).
-The following filter matches the values for which the extraction function has 
transformation entry `input_key=output_value` where
-`output_value` is equal to the filter `value` and `input_key` is present as 
dimension.
+The column comparison filter is similar to the selector filter, but compares 
dimensions to each other. For example:
 
-**Example**
-The following matches dimension values in `[product_1, product_3, product_5]` 
for the column `product`
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "selector".| Yes |
+| `dimensions` | List of [`DimensionSpec`](./dimensionspecs.md) to compare. | 
Yes |
 
-```json
-{
-    "filter": {
-        "type": "extraction",
-        "dimension": "product",
-        "value": "bar_1",
-        "extractionFn": {
-            "type": "lookup",
-            "lookup": {
-                "type": "map",
-                "map": {
-                    "product_1": "bar_1",
-                    "product_5": "bar_1",
-                    "product_3": "bar_1"
-                }
-            }
-        }
-    }
-}
-```
+`dimensions` is list of [DimensionSpecs](./dimensionspecs.md), making it 
possible to apply an extraction function if needed.
 
-## Search filter
+Note that the column comparison filter converts all values to strings prior to 
comparison. This allows differently-typed input columns to match without a cast 
operation.
 
-Search filters can be used to filter on partial string matches.
+### Example: equivalent of `WHERE someColumn = someLongColumn`
 
-```json
+``` json
 {
-    "filter": {
-        "type": "search",
-        "dimension": "product",
-        "query": {
-          "type": "insensitive_contains",
-          "value": "foo"
-        }
+  "type": "columnComparison",
+  "dimensions": [
+    "someColumn",
+    {
+      "type" : "default",
+      "dimension" : someLongColumn,
+      "outputType": "LONG"
     }
+  ]
 }
 ```
 
-|property|description|required?|
-|--------|-----------|---------|
-|type|This String should always be "search".|yes|
-|dimension|The dimension to perform the search over.|yes|
-|query|A JSON object for the type of search. See [search query 
spec](#search-query-spec) for more information.|yes|
-|extractionFn|[Extraction function](#filtering-with-extraction-functions) to 
apply to the dimension|no|
 
-The search filter supports the use of extraction functions, see [Filtering 
with Extraction Functions](#filtering-with-extraction-functions) for details.
+## Logical expression filters
 
-### Search query spec
+### AND
 
-#### Contains
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "and".| Yes |
+| `fields` | List of filter JSON objects, such as any other filter defined on 
this page or provided by extensions. | Yes |
 
-|property|description|required?|
-|--------|-----------|---------|
-|type|This String should always be "contains".|yes|
-|value|A String value to run the search over.|yes|
-|caseSensitive|Whether two string should be compared as case sensitive or 
not|no (default == false)|
 
-#### Insensitive Contains
+#### Example: equivalent of `WHERE someColumn = 'a' AND otherColumn = 1234 AND 
anotherColumn IS NULL`
 
-|property|description|required?|
-|--------|-----------|---------|
-|type|This String should always be "insensitive_contains".|yes|
-|value|A String value to run the search over.|yes|
+``` json
+{
+  "type": "and",
+  "fields": [
+    { "type": "equality", "column": "someColumn", "matchValue": "a", 
"matchValueType": "STRING" },
+    { "type": "equality", "column": "otherColumn", "matchValue": 1234, 
"matchValueType": "LONG" },
+    { "type": "null", "column": "anotherColumn" } 
+  ]
+}
+```
 
-Note that an "insensitive_contains" search is equivalent to a "contains" 
search with "caseSensitive": false (or not
-provided).
+### OR
 
-#### Fragment
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "or".| Yes |
+| `fields` | List of filter JSON objects, such as any other filter defined on 
this page or provided by extensions. | Yes |
 
-|property|description|required?|
-|--------|-----------|---------|
-|type|This String should always be "fragment".|yes|
-|values|A JSON array of String values to run the search over.|yes|
-|caseSensitive|Whether strings should be compared as case sensitive or not. 
Default: false(insensitive)|no|
+#### Example: equivalent of `WHERE someColumn = 'a' OR otherColumn = 1234 OR 
anotherColumn IS NULL`
 
-## In filter
+``` json
+{
+  "type": "or",
+  "fields": [
+    { "type": "equality", "column": "someColumn", "matchValue": "a", 
"matchValueType": "STRING" },
+    { "type": "equality", "column": "otherColumn", "matchValue": 1234, 
"matchValueType": "LONG" },
+    { "type": "null", "column": "anotherColumn" } 
+  ]
+}
+```
 
-In filter can be used to express the following SQL query:
+### NOT
 
-```sql
- SELECT COUNT(*) AS 'Count' FROM `table` WHERE `outlaw` IN ('Good', 'Bad', 
'Ugly')
-```
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "not".| Yes |
+| `field` | Filter JSON objects, such as any other filter defined on this page 
or provided by extensions. | Yes |
 
-The grammar for a "in" filter is as follows:
+#### Example: equivalent of `WHERE someColumn IS NOT NULL`
 
 ```json
-{
-    "type": "in",
-    "dimension": "outlaw",
-    "values": ["Good", "Bad", "Ugly"]
-}
+{ "type": "not", "field": { "type": "null", "column": "someColumn" }}
 ```
 
-The "in" filter supports the use of extraction functions, see [Filtering with 
Extraction Functions](#filtering-with-extraction-functions) for details.
-
-If an empty `values` array is passed to the "in" filter, it will simply return 
an empty result.
 
-If the `dimension` is a multi-valued dimension, the "in" filter will return 
true if one of the dimension values is
-in the `values` array.
-
-If the `values` array contains `null`, the "in" filter matches null values. 
This differs from the SQL IN filter, which
-does not match NULL values.
+## In filter
+The in filter can match input rows against a set of values, where a match 
occurs if the value is contained in the set.
 
-## Like filter
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "in".| Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `values` | List of string value to match. | Yes |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
 
-Like filters can be used for basic wildcard searches. They are equivalent to 
the SQL LIKE operator. Special characters
-supported are "%" (matches any number of characters) and "\_" (matches any one 
character).
 
-|property|type|description|required?|
-|--------|-----------|---------|---------|
-|type|String|This should always be "like".|yes|
-|dimension|String|The dimension to filter on|yes|
-|pattern|String|LIKE pattern, such as "foo%" or "___bar".|yes|
-|escape|String|An escape character that can be used to escape special 
characters.|no|
-|extractionFn|[Extraction function](#filtering-with-extraction-functions)| 
Extraction function to apply to the dimension|no|
+If an empty `values` array is passed to the "in" filter, it will simply return 
an empty result.
 
-Like filters support the use of extraction functions, see [Filtering with 
Extraction Functions](#filtering-with-extraction-functions) for details.
+If the `values` array contains `null`, the "in" filter matches null values. 
This differs from the SQL IN filter, which
+does not match NULL values.
 
-This Like filter expresses the condition `last_name LIKE "D%"` (i.e. last_name 
starts with "D").
+### Example: equivalent of `WHERE `outlaw` IN ('Good', 'Bad', 'Ugly')`
 
 ```json
 {
-    "type": "like",
-    "dimension": "last_name",
-    "pattern": "D%"
+    "type": "in",
+    "dimension": "outlaw",
+    "values": ["Good", "Bad", "Ugly"]
 }
 ```
 
+
 ## Bound filter
 
 Bound filters can be used to filter on ranges of dimension values. It can be 
used for comparison filtering like
 greater than, less than, greater than or equal to, less than or equal to, and 
"between" (if both "lower" and
 "upper" are set).
 
-|property|type|description|required?|
-|--------|-----------|---------|---------|
-|type|String|This should always be "bound".|yes|
-|dimension|String|The dimension to filter on|yes|
-|lower|String|The lower bound for the filter|no|
-|upper|String|The upper bound for the filter|no|
-|lowerStrict|Boolean|Perform strict comparison on the lower bound (">" instead 
of ">=")|no, default: false|
-|upperStrict|Boolean|Perform strict comparison on the upper bound ("<" instead 
of "<=")|no, default: false|
-|ordering|String|Specifies the sorting order to use when comparing values 
against the bound. Can be one of the following values: "lexicographic", 
"alphanumeric", "numeric", "strlen", "version". See [Sorting 
Orders](./sorting-orders.md) for more details.|no, default: "lexicographic"|
-|extractionFn|[Extraction function](#filtering-with-extraction-functions)| 
Extraction function to apply to the dimension|no|
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "bound". | Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `lower` | The lower bound string match value for the filter. | No |
+| `upper`| The upper bound string match value for the filter. | No |
+| `lowerStrict` | Boolean indicating whether to perform strict comparison on 
the `lower` bound (">" instead of ">="). | No, default: `false` |
+| `upperStrict` | Boolean indicating whether to perform strict comparison on 
the upper bound ("<" instead of "<="). | No, default: `false`|
+| `ordering` | String that specifies the sorting order to use when comparing 
values against the bound. Can be one of the following values: 
`"lexicographic"`, `"alphanumeric"`, `"numeric"`, `"strlen"`, `"version"`. See 
[Sorting Orders](./sorting-orders.md) for more details. | No, default: 
`"lexicographic"`|
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
+
+When the bound filter matches against numeric inputs, the string `lower` and 
`upper` bound values are best-effort coerced into a numeric value when using 
the `"numeric"` mode of ordering.
 
-Bound filters support the use of extraction functions, see [Filtering with 
Extraction Functions](#filtering-with-extraction-functions) for details.
+The bound filter can only match against `STRING` (single and multi-valued), 
`LONG`, `FLOAT`, `DOUBLE` types. Use the newer range to match against `ARRAY` 
or `COMPLEX` types.
 
-The following bound filter expresses the condition `21 <= age <= 31`:
+Note that the bound filter matches null values if you don't specify a lower 
bound. Use the range filter if SQL-compatible behavior.
+
+### Example: equivalent to `WHERE 21 <= age <= 31`
 
 ```json
 {
@@ -303,7 +261,7 @@ The following bound filter expresses the condition `21 <= 
age <= 31`:
 }
 ```
 
-This filter expresses the condition `foo <= name <= hoo`, using the default 
lexicographic sorting order.
+### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using the default 
lexicographic sorting order
 
 ```json
 {
@@ -314,7 +272,7 @@ This filter expresses the condition `foo <= name <= hoo`, 
using the default lexi
 }
 ```
 
-Using strict bounds, this filter expresses the condition `21 < age < 31`
+### Example: equivalent to `WHERE 21 < age < 31`
 
 ```json
 {
@@ -328,7 +286,7 @@ Using strict bounds, this filter expresses the condition 
`21 < age < 31`
 }
 ```
 
-The user can also specify a one-sided bound by omitting "upper" or "lower". 
This filter expresses `age < 31`.
+### Example: equivalent to `WHERE age < 31`
 
 ```json
 {
@@ -340,7 +298,7 @@ The user can also specify a one-sided bound by omitting 
"upper" or "lower". This
 }
 ```
 
-Likewise, this filter expresses `age >= 18`
+### Example: equivalent to `WHERE age >= 18`
 
 ```json
 {
@@ -352,18 +310,154 @@ Likewise, this filter expresses `age >= 18`
 ```
 
 
+## Range filter
+
+The range filter is a replacement for the bound filter. It compares against 
any type of column and is designed to have has more SQL compliant behavior than 
the bound filter. It won't match null values, even if you don't specify a lower 
bound.
+
+Druid's SQL planner uses the range filter by default instead of bound filter 
whenever `druid.generic.useDefaultValueForNull=false`, or if 
`sqlUseBoundAndSelectors` is set to false on the [SQL query 
context](./sql-query-context.md).
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "range".| Yes |
+| `column` | Input column or virtual column name to filter. | Yes |
+| `matchValueType` | String specifying the type of bounds to match. For 
example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY<STRING>`, `ARRAY<LONG>`, or 
any other Druid type. The `matchValueType` determines how Druid interprets the 
`matchValue` to assist in converting to the type of the matched `column` and 
also defines the type of comparison used when matching values. | Yes |
+| `lower` | Lower bound value to match. | No. At least one of `lower` or 
`upper` must not be null. |
+| `upper` | Upper bound value to match. | No. At least one of `lower` or 
`upper` must not be null. |
+| `lowerOpen` | Boolean indicating if lower bound is open in the interval of 
values defined by the range (">" instead of ">="). | No |
+| `upperOpen` | Boolean indicating if upper bound is open on the interval of 
values defined by range ("<" instead of "<="). | No |
+
+### Example: equivalent to `WHERE 21 <= age <= 31`
+
+```json
+{
+    "type": "range",
+    "column": "age",
+    "matchValueType": "LONG",
+    "lower": 21,
+    "upper": 31
+}
+```
+
+### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using STRING 
comparison
+
+```json
+{
+    "type": "range",
+    "column": "name",
+    "matchValueType": "STRING",
+    "lower": "foo",
+    "upper": "hoo"
+}
+```
+
+### Example: equivalent to `WHERE 21 < age < 31`
+
+```json
+{
+    "type": "range",
+    "column": "age",
+    "matchValueType": "LONG",
+    "lower": "21",
+    "lowerOpen": true,
+    "upper": "31" ,
+    "upperOpen": true
+}
+```
+
+### Example: equivalent to `WHERE age < 31`
+
+```json
+{
+    "type": "range",
+    "column": "age",
+    "matchValueType": "LONG",
+    "upper": "31" ,
+    "upperOpen": true
+}
+```
+
+### Example: equivalent to `WHERE age >= 18`
+
+```json
+{
+    "type": "range",
+    "column": "age",
+    "matchValueType": "LONG",
+    "lower": 18
+}
+```
+
+### Example: equivalent to `WHERE ARRAY['a','b','c'] < arrayColumn < 
ARRAY['d','e','f']`, using ARRAY comparison
+
+```json
+{
+    "type": "range",
+    "column": "name",
+    "matchValueType": "ARRAY<STRING>",
+    "lower": ["a","b","c"],
+    "lowerOpen": true,
+    "upper": ["d","e","f"],
+    "upperOpen": true
+}
+```
+
+
+## Like filter
+
+Like filters can be used for basic wildcard searches. They are equivalent to 
the SQL LIKE operator. Special characters
+supported are "%" (matches any number of characters) and "\_" (matches any one 
character).
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "like".| Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `pattern` | String LIKE pattern, such as "foo%" or "___bar".| Yes |
+| `escape`| A string escape character that can be used to escape special 
characters. | No |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
+
+Like filters support the use of extraction functions, see [Filtering with 
Extraction Functions](#filtering-with-extraction-functions) for details.
+
+### Example: equivalent of `WHERE last_name LIKE "D%"` (last_name starts with 
"D")
+
+```json
+{
+    "type": "like",
+    "dimension": "last_name",
+    "pattern": "D%"
+}
+```
+
+## Regular expression filter
+
+The regular expression filter is similar to the selector filter, but using 
regular expressions. It matches the specified dimension with the given pattern.
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "regex".| Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `pattern` | String pattern to match - any standard [Java regular 
expression](http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html).
 | Yes |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
+
+Note that it is often more optimal to use a like filter instead of a regex for 
simple matching of prefixes.
+
+### Example: matches values that start with "50."
+
+``` json
+{ "type": "regex", "dimension": "someColumn", "pattern": ^50.* }
+```
+
 ## Interval filter
 
 The Interval filter enables range filtering on columns that contain long 
millisecond values, with the boundaries specified as ISO 8601 time intervals. 
It is suitable for the `__time` column, long metric columns, and dimensions 
with values that can be parsed as long milliseconds.
 
 This filter converts the ISO 8601 intervals to long millisecond start/end 
ranges and translates to an OR of Bound filters on those millisecond ranges, 
with numeric comparison. The Bound filters will have left-closed and right-open 
matching (i.e., start <= time < end).
 
-|property|type|description|required?|
-|--------|-----------|---------|---------|
-|type|String|This should always be "interval".|yes|
-|dimension|String|The dimension to filter on|yes|
-|intervals|Array|A JSON array containing ISO-8601 interval strings. This 
defines the time ranges to filter on.|yes|
-|extractionFn|[Extraction function](#filtering-with-extraction-functions)| 
Extraction function to apply to the dimension|no|
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "interval". | Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `intervals` | A JSON array containing ISO-8601 interval strings that defines 
the time ranges to filter on. | Yes |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
 
 The interval filter supports the use of extraction functions, see [Filtering 
with Extraction Functions](#filtering-with-extraction-functions) for details.
 
@@ -410,6 +504,157 @@ The filter above is equivalent to the following OR of 
Bound filters:
 }
 ```
 
+
+## True filter
+A filter which matches all values. You can use it to temporarily disable other 
filters without removing them.
+
+```json
+{ "type" : "true" }
+```
+
+## False filter
+A filter matches no values. You can use it to force a query to match no values.
+
+```json
+{"type": "false" }
+```
+
+
+## Search filter
+
+You can use search filters to filter on partial string matches.
+
+```json
+{
+    "filter": {
+        "type": "search",
+        "dimension": "product",
+        "query": {
+          "type": "insensitive_contains",
+          "value": "foo"
+        }
+    }
+}
+```
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "search". | Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `query`| A JSON object for the type of search. See [search query 
spec](#search-query-spec) for more information. | Yes |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
+
+### Search query spec
+
+#### Contains
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "contains". | Yes |
+| `value` | A String value to search. | Yes |
+| `caseSensitive` | Whether the string comparison is case-sensitive or not. | 
No, default is false (insensitive) |
+
+#### Insensitive contains
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "insensitive_contains". | Yes |
+| `value` | A String value to search. | Yes |
+
+Note that an "insensitive_contains" search is equivalent to a "contains" 
search with "caseSensitive": false (or not
+provided).
+
+#### Fragment
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "fragment". | Yes |
+| `values` | A JSON array of string values to search. | Yes |
+| `caseSensitive` | Whether the string comparison is case-sensitive or not. | 
No, default is false (insensitive) |
+
+
+
+## Expression filter
+
+The expression filter allows for the implementation of arbitrary conditions, 
leveraging the Druid expression system. This filter allows for complete 
flexibility, but it might be less performant than a combination of the other 
filters on this page because it can't always use the same optimizations 
available to other filters.
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "expression" | Yes |
+| `expression` | Expression string to evaluate into true or false. See the 
[Druid expression system](math-expr.md) for more details. | Yes |
+
+### Example: expression based matching
+
+```json
+{ 
+    "type" : "expression" ,
+    "expression" : "((product_type == 42) && (!is_deleted))"
+}
+```
+
+
+## JavaScript filter
+
+The JavaScript filter matches a dimension against the specified JavaScript 
function predicate. The filter matches values for which the function returns 
true.
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "javascript" | Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `function` | JavaScript function which accepts the dimension value as a 
single argument, and returns either true or false. | Yes |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
+
+### Example: matching any dimension values for the dimension `name` between 
`'bar'` and `'foo'`
+
+```json
+{
+  "type" : "javascript",
+  "dimension" : "name",
+  "function" : "function(x) { return(x >= 'bar' && x <= 'foo') }"
+}
+```
+
+> JavaScript-based functionality is disabled by default. Refer to the Druid 
[JavaScript programming guide](../development/javascript.md) for guidelines 
about using Druid's JavaScript functionality, including instructions on how to 
enable it.
+
+
+## Extraction filter
+
+> The extraction filter is now deprecated. Use the selector filter with an 
extraction function instead.
+
+Extraction filter matches a dimension using a specific [extraction 
function](./dimensionspecs.md#extraction-functions).
+The following filter matches the values for which the extraction function has 
a transformation entry `input_key=output_value` where
+`output_value` is equal to the filter `value` and `input_key` is present as a 
dimension.
+
+| Property | Description | Required |
+| -------- | ----------- | -------- |
+| `type` | Must be "extraction" | Yes |
+| `dimension` | Input column or virtual column name to filter. | Yes |
+| `value` | String value to match. | No. If not specified the filter will 
match NULL values. |
+| `extractionFn` | [Extraction 
function](./dimensionspecs.md#extraction-functions) to apply to `dimension` 
prior to value matching. See [filtering with extraction 
functions](#filtering-with-extraction-functions) for details. | No |
+
+### Example: matching dimension values in `[product_1, product_3, product_5]` 
for the column `product`
+
+```json
+{
+    "filter": {
+        "type": "extraction",
+        "dimension": "product",
+        "value": "bar_1",
+        "extractionFn": {
+            "type": "lookup",
+            "lookup": {
+                "type": "map",
+                "map": {
+                    "product_1": "bar_1",
+                    "product_5": "bar_1",
+                    "product_3": "bar_1"
+                }
+            }
+        }
+    }
+}
+```
+
 ## Filtering with extraction functions
 
 All filters except the "spatial" filter support extraction functions.
@@ -420,9 +665,7 @@ If specified, the extraction function will be used to 
transform input values bef
 The example below shows a selector filter combined with an extraction 
function. This filter will transform input values
 according to the values defined in the lookup map; transformed values will 
then be matched with the string "bar_1".
 
-
-**Example**
-The following matches dimension values in `[product_1, product_3, product_5]` 
for the column `product`
+### Example: matches dimension values in `[product_1, product_3, product_5]` 
for the column `product`
 
 ```json
 {
@@ -449,29 +692,97 @@ The following matches dimension values in `[product_1, 
product_3, product_5]` fo
 
 Druid supports filtering on timestamp, string, long, and float columns.
 
-Note that only string columns have bitmap indexes. Therefore, queries that 
filter on other column types will need to
+Note that only string columns and columns produced with the ['auto' ingestion 
spec](../ingestion/ingestion-spec.md#dimension-objects) also used by [type 
aware schema 
discovery](../ingestion/schema-design.md#type-aware-schema-discovery) have 
bitmap indexes. Queries that filter on other column types must
 scan those columns.
 
+### Filtering on multi-value string columns
+
+All filters return true if any one of the dimension values is satisfies the 
filter.
+
+#### Example: multi-value match behavior
+Given a multi-value STRING row with values `['a', 'b', 'c']`, a filter such as
+
+```json
+{ "type": "equality", "column": "someMultiValueColumn", "matchValueType": 
"STRING", "matchValue": "b" }
+```
+will successfully match the entire row. This can produce sometimes unintuitive 
behavior when coupled with the implicit UNNEST functionality of Druid 
[GroupBy](./groupbyquery.md) and [TopN](./topnquery.md) queries.
+
+Additionally, contradictory filters may be defined and perfectly legal in 
native queries which will not work in SQL.
+
+#### Example: SQL "contradiction"
+This query is impossible to express as is in SQL since it is a contradiction 
that the SQL planner will optimize to false and match nothing.
+
+Given a multi-value STRING row with values `['a', 'b', 'c']`, and filter such 
as
+```json
+{
+  "type": "and",
+  "fields": [
+    {
+      "type": "equality",
+      "column": "someMultiValueColumn",
+      "matchValueType": "STRING",
+      "matchValue": "a"
+    },
+    {
+      "type": "equality",
+      "column": "someMultiValueColumn",
+      "matchValueType": "STRING",
+      "matchValue": "b"
+    }
+  ]
+}
+```
+will successfully match the entire row, but not match a row with value `['a', 
'c']`.
+
+To express this filter in SQL, use [SQL multi-value string 
functions](./sql-multivalue-string-functions.md) such as `MV_CONTAINS`, which 
can be optimized by the planner to the same native filters.
+
 ### Filtering on numeric columns
 
-When filtering on numeric columns, you can write filters as if they were 
strings. In most cases, your filter will be
+Some filters, such as equality and range filters allow accepting numeric match 
values directly since they include a secondary `matchValueType` parameter.
+
+When filtering on numeric columns using string based filters such as the 
selector, in, and bounds filters, you can write filter match values as if they 
were strings. In most cases, your filter will be
 converted into a numeric predicate and will be applied to the numeric column 
values directly. In some cases (such as
 the "regex" filter) the numeric column values will be converted to strings 
during the scan.
 
-For example, filtering on a specific value, `myFloatColumn = 10.1`:
+#### Example: filtering on a specific value, `myFloatColumn = 10.1`
 
 ```json
-"filter": {
+{
+  "type": "equality",
+  "dimension": "myFloatColumn",
+  "matchValueType": "FLOAT",
+  "value": 10.1
+}
+```
+
+or with a selector filter:
+
+```json
+{
   "type": "selector",
   "dimension": "myFloatColumn",
   "value": "10.1"
 }
 ```
 
-Filtering on a range of values, `10 <= myFloatColumn < 20`:
+#### Example: filtering on a range of values, `10 <= myFloatColumn < 20`
 
 ```json
-"filter": {
+{
+  "type": "range",
+  "column": "myFloatColumn",
+  "matchvalueType": "FLOAT",
+  "lower": 10.1,
+  "lowerOpen": false,
+  "upper": 20.9,
+  "upperOpen": true
+}
+```
+
+or with a bound filter:
+
+```json
+{
   "type": "bound",
   "dimension": "myFloatColumn",
   "ordering": "numeric",
@@ -488,22 +799,33 @@ Query filters can also be applied to the timestamp 
column. The timestamp column
 to the timestamp column, use the string `__time` as the dimension name. Like 
numeric dimensions, timestamp filters
 should be specified as if the timestamp values were strings.
 
-If the user wishes to interpret the timestamp with a specific format, 
timezone, or locale, the [Time Format Extraction 
Function](./dimensionspecs.md#time-format-extraction-function) is useful.
+If you want to interpret the timestamp with a specific format, timezone, or 
locale, the [Time Format Extraction 
Function](./dimensionspecs.md#time-format-extraction-function) is useful.
+
+#### Example: filtering on a long timestamp value
+
+```json
+{
+  "type": "equality",
+  "dimension": "__time",
+  "matchValueType": "LONG",
+  "value": 124457387532
+}
+```
 
-For example, filtering on a long timestamp value:
+or with a selector filter:
 
 ```json
-"filter": {
+{
   "type": "selector",
   "dimension": "__time",
   "value": "124457387532"
 }
 ```
 
-Filtering on day of week:
+#### Example: filtering on day of week using an extraction function
 
 ```json
-"filter": {
+{
   "type": "selector",
   "dimension": "__time",
   "value": "Friday",
@@ -516,7 +838,7 @@ Filtering on day of week:
 }
 ```
 
-Filtering on a set of ISO 8601 intervals:
+#### Example: filtering on a set of ISO 8601 intervals
 
 ```json
 {
@@ -529,25 +851,3 @@ Filtering on a set of ISO 8601 intervals:
 }
 ```
 
-### True filter
-The true filter is a filter which matches all values.  It can be used to 
temporarily disable other filters without removing the filter.
-
-```json
-
-{ "type" : "true" }
-```
-
-### Expression filter
-The expression filter allows for the implementation of arbitrary conditions, 
leveraging the Druid expression system. 
-
-This filter allows for more flexibility, but it might be less performant than 
a combination of the other filters on this page due to the fact that not all 
filter optimizations are in place yet.
-
-```json
-
-{ 
-    "type" : "expression" ,
-    "expression" : "((product_type == 42) && (!is_deleted))"
-}
-```
-
-See the [Druid expression system](math-expr.md) for more details.
diff --git a/docs/querying/sql-query-context.md 
b/docs/querying/sql-query-context.md
index e469fa390a..7798fbf34c 100644
--- a/docs/querying/sql-query-context.md
+++ b/docs/querying/sql-query-context.md
@@ -44,6 +44,7 @@ Configure Druid SQL query planning using the parameters in 
the table below.
 |`enableTimeBoundaryPlanning`|If true, SQL queries will get converted to 
TimeBoundary queries wherever possible. TimeBoundary queries are very efficient 
for min-max calculation on `__time` column in a datasource 
|`druid.query.default.context.enableTimeBoundaryPlanning` on the Broker 
(default: false)|
 |`useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain 
plan as a JSON representation of equivalent native query(s), else it will 
return the original version of explain plan generated by Calcite.<br /><br 
/>This property is provided for backwards compatibility. It is not recommended 
to use this parameter unless you were depending on the older 
behavior.|`druid.sql.planner.useNativeQueryExplain` on the Broker (default: 
true)|
 |`sqlFinalizeOuterSketches`|If false (default behavior in Druid 25.0.0 and 
later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in 
query results, as documented. If true (default behavior in Druid 24.0.1 and 
earlier), sketches from these functions are finalized when they appear in query 
results.<br /><br />This property is provided for backwards compatibility with 
behavior in Druid 24.0.1 and earlier. It is not recommended to use this 
parameter unless you were depending [...]
+|`sqlUseBoundAndSelectors`|If false (default behavior if 
`druid.generic.useDefaultValueForNull=false` in Druid 27.0.0 and later), the 
SQL planner will use [equality](./filters.md#equality-filter), 
[null](./filters.md#null-filter), and [range](./filters.md#range-filter) 
filters instead of [selector](./filters.md#selector-filter) and 
[bounds](./filters.md#bound-filter). This value must be set to `false` for 
correct behavior for filtering `ARRAY` typed values. | Defaults to same value 
as `d [...]
 
 ## Setting the query context
 The query context parameters can be specified as a "context" object in the 
[JSON API](../api-reference/sql-api.md) or as a [JDBC connection properties 
object](../api-reference/sql-jdbc.md).


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


Reply via email to