This is an automated email from the ASF dual-hosted git repository.

vogievetsky pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new 7f3c117e3a SQL: Improve docs around casts. (#13466)
7f3c117e3a is described below

commit 7f3c117e3ab0971dd52180ac99d4da6cc092b598
Author: Gian Merlino <[email protected]>
AuthorDate: Thu Dec 15 15:01:40 2022 -0800

    SQL: Improve docs around casts. (#13466)
    
    Main change: clarify that the "default value" for casts only applies if
    druid.generic.useDefaultValueForNull = true.
    
    Secondary change: adjust a bunch of wording from future to present tense.
---
 docs/querying/sql-data-types.md | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/docs/querying/sql-data-types.md b/docs/querying/sql-data-types.md
index 11c8cda0fc..bb1766d27e 100644
--- a/docs/querying/sql-data-types.md
+++ b/docs/querying/sql-data-types.md
@@ -40,14 +40,17 @@ milliseconds since 1970-01-01 00:00:00 UTC, not counting 
leap seconds. Therefore
 timezone information, but only carry information about the exact moment in 
time they represent. See the
 [Time functions](sql-scalar.md#date-and-time-functions) section for more 
information about timestamp handling.
 
-The following table describes how Druid maps SQL types onto native types at 
query runtime. Casts between two SQL types
-that have the same Druid runtime type will have no effect, other than 
exceptions noted in the table. Casts between two
-SQL types that have different Druid runtime types will generate a runtime cast 
in Druid. If a value cannot be properly
-cast to another value, as in `CAST('foo' AS BIGINT)`, the runtime will 
substitute a default value. NULL values cast
-to non-nullable types will also be substituted with a default value (for 
example, nulls cast to numbers will be
-converted to zeroes).
-
-|SQL type|Druid runtime type|Default value|Notes|
+Casts between two SQL types with the same Druid runtime type (see below table) 
have no effect, other than exceptions
+noted in the table. Casts between two SQL types that have different Druid 
runtime types generate a runtime cast in
+Druid. If a value cannot be cast to the target type, as in `CAST('foo' AS 
BIGINT)`, Druid either substitutes a default
+value (when `druid.generic.useDefaultValueForNull = true`, the default mode), 
or substitutes [NULL](#null-values) (when
+`druid.generic.useDefaultValueForNull = false`). NULL values cast to 
non-nullable types are also substituted with a
+default value. For example, if `druid.generic.useDefaultValueForNull = true`, 
a null VARCHAR cast to BIGINT is converted
+to a zero.
+
+The following table describes how Druid maps SQL types onto native types when 
running queries.
+
+|SQL type|Druid runtime type|Default value<sup>*</sup>|Notes|
 |--------|------------------|-------------|-----|
 |CHAR|STRING|`''`||
 |VARCHAR|STRING|`''`|Druid STRING columns are reported as VARCHAR. Can include 
[multi-value strings](#multi-value-strings) as well.|
@@ -64,21 +67,23 @@ converted to zeroes).
 |DATE|LONG|`0`, meaning 1970-01-01|Casting TIMESTAMP to DATE rounds down the 
timestamp to the nearest day. Casts between string and date types assume 
standard SQL formatting, e.g. `2000-01-02`. For handling other formats, use one 
of the [time functions](sql-scalar.md#date-and-time-functions).|
 |OTHER|COMPLEX|none|May represent various Druid column types such as 
hyperUnique, approxHistogram, etc.|
 
+<sup>*</sup> Default value applies if `druid.generic.useDefaultValueForNull = 
true` (the default mode). Otherwise, the default value is `NULL` for all types.
+
 ## Multi-value strings
 
 Druid's native type system allows strings to potentially have multiple values. 
These
-[multi-value string dimensions](multi-value-dimensions.md) will be reported in 
SQL as `VARCHAR` typed, and can be
-syntactically used like any other VARCHAR. Regular string functions that refer 
to multi-value string dimensions will be
+[multi-value string dimensions](multi-value-dimensions.md) are reported in SQL 
as `VARCHAR` typed, and can be
+syntactically used like any other VARCHAR. Regular string functions that refer 
to multi-value string dimensions are
 applied to all values for each row individually. Multi-value string dimensions 
can also be treated as arrays via special
 [multi-value string functions](sql-multivalue-string-functions.md), which can 
perform powerful array-aware operations.
 
-Grouping by a multi-value expression will observe the native Druid multi-value 
aggregation behavior, which is similar to
+Grouping by a multi-value expression observes the native Druid multi-value 
aggregation behavior, which is similar to
 the `UNNEST` functionality available in some other SQL dialects. Refer to the 
documentation on
 [multi-value string dimensions](multi-value-dimensions.md) for additional 
details.
 
 > Because multi-value dimensions are treated by the SQL planner as `VARCHAR`, 
 > there are some inconsistencies between how
 > they are handled in Druid SQL and in native queries. For example, 
 > expressions involving multi-value dimensions may be
-> incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND 
multi_val_dim = 'b'` will be optimized to
+> incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND 
multi_val_dim = 'b'` is optimized to
 > `false`, even though it is possible for a single row to have both "a" and 
 > "b" as values for `multi_val_dim`. The
 > SQL behavior of multi-value dimensions will change in a future release to 
 > more closely align with their behavior
 > in native queries.
@@ -90,11 +95,11 @@ runtime property controls Druid's NULL handling mode. For 
the most SQL compliant
 
 When `druid.generic.useDefaultValueForNull = true` (the default mode), Druid 
treats NULLs and empty strings
 interchangeably, rather than according to the SQL standard. In this mode Druid 
SQL only has partial support for NULLs.
-For example, the expressions `col IS NULL` and `col = ''` are equivalent, and 
both will evaluate to true if `col`
-contains an empty string. Similarly, the expression `COALESCE(col1, col2)` 
will return `col2` if `col1` is an empty
-string. While the `COUNT(*)` aggregator counts all rows, the `COUNT(expr)` 
aggregator will count the number of rows
+For example, the expressions `col IS NULL` and `col = ''` are equivalent, and 
both evaluate to true if `col`
+contains an empty string. Similarly, the expression `COALESCE(col1, col2)` 
returns `col2` if `col1` is an empty
+string. While the `COUNT(*)` aggregator counts all rows, the `COUNT(expr)` 
aggregator counts the number of rows
 where `expr` is neither null nor the empty string. Numeric columns in this 
mode are not nullable; any null or missing
-values will be treated as zeroes.
+values are treated as zeroes.
 
 When `druid.generic.useDefaultValueForNull = false`, NULLs are treated more 
closely to the SQL standard. In this mode,
 numeric NULL is permitted, and NULLs and empty strings are no longer treated 
as interchangeable. This property


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to