clintropolis commented on code in PR #16704:
URL: https://github.com/apache/druid/pull/16704#discussion_r1681936535


##########
docs/release-info/migr-ansi-sql-null.md:
##########
@@ -0,0 +1,272 @@
+---
+id: migr-ansi-sql-null
+title: "Migration guide: SQL compliant mode"
+sidebar_label: SQL compliant mode
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+-->
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+In Apache Druid 28.0.0, the default [null 
handling](../querying/sql-data-types.md#null-values) mode changed to be 
compliant with the SQL standard.
+This guide provides strategies for Druid operators who rely on the legacy 
Druid null handling behavior in their applications.

Review Comment:
   ```suggestion
   This guide provides strategies for Druid operators who rely on the legacy 
Druid null handling behavior in their applications to transition to SQL 
compliant mode.
   ```
   Also, I wonder if we should fit in the "why": legacy mode is deprecated and 
will eventually be removed.



##########
docs/release-info/migr-ansi-sql-null.md:
##########
@@ -0,0 +1,272 @@
+In Apache Druid 28.0.0, the default [null 
handling](../querying/sql-data-types.md#null-values) mode changed to be 
compliant with the SQL standard.
+This guide provides strategies for Druid operators who rely on the legacy 
Druid null handling behavior in their applications.
+It provides strategies to emulate legacy null handling mode while operating 
Druid in SQL compliant null handling mode.
+
+## SQL compliant null handling in Druid
+
+Now, Druid writes segments in a SQL compatible null handling mode by default.
+This means that Druid stores null values distinctly from empty strings for 
string dimensions and distinctly from 0 for numeric dimensions.
+
+This can impact your application behavior because the SQL standard defines any 
comparison to null to be unknown.
+According to this three-value logic, a filter such as `x <> 'some value'` only 
returns rows where `x` is not null.
+
+The default Druid configurations for SQL compatible null handling mode are as 
follows:
+
+* `druid.generic.useDefaultValueForNull=false`
+* `druid.expressions.useStrictBooleans=true`
+* `druid.generic.useThreeValueLogicForNativeFilters=true` 
+
+Follow the [Null handling tutorial](../tutorials/tutorial-sql-null.md) to 
learn how the default null handling works in Druid.
+
+## Legacy null handling and two-value logic
+
+Prior to Druid 28.0.0, Druid defaulted to a legacy mode which stored default 
values instead of nulls.
+In legacy mode, Druid segments created at ingestion time have the following 
characteristics:
+
+- String columns cannot distinguish an empty string, `''`, from null.
+    Therefore, Druid treats them both as interchangeable values.
+- Numeric columns cannot represent null-valued rows.
+    Therefore, Druid stores 0 instead of null. 
+
+The Druid configurations for the deprecated legacy mode are as follows:
+
+* `druid.generic.useDefaultValueForNull=true`
+* `druid.expressions.useStrictBooleans=false`
+* `druid.generic.useThreeValueLogicForNativeFilters=true`
+
+Note that these configurations are deprecated and scheduled for removal.
+
+## Migrate to SQL compliant mode
+
+If your business logic relies on the behavior of legacy mode, you can emulate 
the null handling behavior while operating Druid in SQL compatible null 
handling mode.

Review Comment:
   While ingest-time transformations that eliminate nulls do somewhat emulate 
the behavior, I feel like updating your queries is more a matter of 
"transitioning your queries into standard SQL" to adapt to the change.
   
   Also, we might need to be careful with the word "emulate", since that 
isn't totally true. In legacy mode for strings, the empty value and null are 
used interchangeably. This means that where clauses such as `WHERE x IS NULL` 
match the empty string in legacy mode, but this is not true in SQL 
compatible mode, where the empty string is strictly the empty string value. I'm 
not entirely sure of a better word, but we might want to mention this quirk of 
legacy mode.
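
To make the quirk concrete, here is a minimal sketch of the SQL compliant side of the comparison. It assumes SQLite (through Python's `sqlite3` module) as a stand-in for standard SQL semantics; it does not run against Druid, and the table `t` and column `x` are hypothetical names used only for illustration.

```python
import sqlite3

# In-memory database with one regular value, one empty string, and one null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("a",), ("",), (None,)])

# IS NULL matches only the true null, not the empty string.
null_rows = conn.execute(
    "SELECT COUNT(*) FROM t WHERE x IS NULL"
).fetchone()[0]

# Comparing null with <> evaluates to unknown, so the null row is filtered out.
not_a_rows = [row[0] for row in conn.execute("SELECT x FROM t WHERE x <> 'a'")]

print(null_rows)   # 1
print(not_a_rows)  # ['']
```

In legacy mode the empty string and null would not be distinguishable in this way, which is the part that query rewrites alone cannot reproduce.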
   
   I think that is why I suggested, when we chatted offline, that it is 
also possible to coerce empty strings into `NULL` using the `NULLIF` 
function. If someone is more used to querying those values and treating them as 
`NULL` instead of the empty string, it might make migration easier if the input 
data has empty strings, or a mixture of null and empty strings.
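
As a rough sketch of that coercion, the snippet below applies `NULLIF(x, '')` to a mixture of values, empty strings, and nulls. It again assumes SQLite via Python's `sqlite3` as a stand-in for standard SQL behavior rather than Druid itself, and the table name `raw_input` is hypothetical.

```python
import sqlite3

# In-memory table holding a mixture of a value, an empty string, and a null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_input (x TEXT)")
conn.executemany(
    "INSERT INTO raw_input VALUES (?)", [("a",), ("",), (None,)]
)

# NULLIF(x, '') returns NULL when x equals '', and x unchanged otherwise.
coerced = [
    row[0]
    for row in conn.execute(
        "SELECT NULLIF(x, '') FROM raw_input ORDER BY rowid"
    )
]

# After coercion, IS NULL matches both the original null and the empty string.
null_count = conn.execute(
    "SELECT COUNT(*) FROM raw_input WHERE NULLIF(x, '') IS NULL"
).fetchone()[0]

print(coerced)     # ['a', None, None]
print(null_count)  # 2
```

The same expression could be applied either at query time or in an ingest-time transformation, depending on where it is easier to change behavior.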



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
