clintropolis commented on code in PR #16673:
URL: https://github.com/apache/druid/pull/16673#discussion_r1680091282


##########
docs/querying/sql-data-types.md:
##########
@@ -34,12 +34,23 @@ Druid associates each column with a specific data type. 
This topic describes sup
 
 Druid natively supports the following basic column types:
 
-* LONG: 64-bit signed int
-* FLOAT: 32-bit float
-* DOUBLE: 64-bit float
-* STRING: UTF-8 encoded strings and string arrays
-* COMPLEX: non-standard data types, such as nested JSON, hyperUnique and 
approxHistogram, and DataSketches
-* ARRAY: arrays composed of any of these types
+* `LONG`: 64-bit signed int
+* `FLOAT`: 32-bit float
+* `DOUBLE`: 64-bit float
+* `STRING`: UTF-8 encoded strings and string arrays
+* `ARRAY`: arrays composed of any of these types
+
+## Complex types
+
+Druid natively supports the following complex types:
+* `COMPLEX<JSON>`: stores a copy of structured data in JSON format and 
specialized internal columns and indexes for nested basic types. Click here to 
learn more about [`COMPLEX<JSON>`](nested-columns.md)
+* `cardinality`: Data structure to compute the cardinality of Apache Druid 
dimensions using the HyperLogLog algorithm. Click here to learn more about 
[`cardinality`](hll-old.md#cardinality-aggregator)
+* `hyperUnique`: Data structure of aggregated values to estimate count 
distinct using a variant of the HyperLogLog approximation algorithm. Consider 
using HLL sketches for better accuracy in many cases. Click here to learn more 
about [`hyperUnique`](hll-old.md#hyperunique-aggregator)

Review Comment:
   If we are going to mention JSON as `COMPLEX<json>`, then this should 
probably be `COMPLEX<hyperUnique>`?
   
   Also, it's strange to tell someone to use something and not also include a 
description of that type... I'm not sure how we should handle extension 
`COMPLEX` types here, since there are a lot of them and I don't think their 
docs do a very good job of articulating what type they store when they are 
ingested into a rollup table.



##########
docs/querying/sql-data-types.md:
##########
@@ -34,12 +34,23 @@ Druid associates each column with a specific data type. 
This topic describes sup
 
 Druid natively supports the following basic column types:
 
-* LONG: 64-bit signed int
-* FLOAT: 32-bit float
-* DOUBLE: 64-bit float
-* STRING: UTF-8 encoded strings and string arrays
-* COMPLEX: non-standard data types, such as nested JSON, hyperUnique and 
approxHistogram, and DataSketches
-* ARRAY: arrays composed of any of these types
+* `LONG`: 64-bit signed int
+* `FLOAT`: 32-bit float
+* `DOUBLE`: 64-bit float
+* `STRING`: UTF-8 encoded strings and string arrays
+* `ARRAY`: arrays composed of any of these types
+
+## Complex types
+
+Druid natively supports the following complex types:
+* `COMPLEX<JSON>`: stores a copy of structured data in JSON format and 
specialized internal columns and indexes for nested basic types. Click here to 
learn more about [`COMPLEX<JSON>`](nested-columns.md)
+* `cardinality`: Data structure to compute the cardinality of Apache Druid 
dimensions using the HyperLogLog algorithm. Click here to learn more about 
[`cardinality`](hll-old.md#cardinality-aggregator)

Review Comment:
   `cardinality` is an aggregator type, not a column type. It builds into a 
`COMPLEX<hyperUnique>` if stored in a column.



##########
docs/querying/sql-data-types.md:
##########
@@ -64,7 +75,7 @@ The following table describes how Druid maps SQL types onto 
native types when ru
 |TIMESTAMP|LONG|`0`, meaning 1970-01-01 00:00:00 UTC|Druid's `__time` column 
is reported as TIMESTAMP. Casts between string and timestamp types assume 
standard SQL formatting, such as `2000-01-02 03:04:05`, not ISO 8601 
formatting. For handling other formats, use one of the [time 
functions](sql-scalar.md#date-and-time-functions).|
 |DATE|LONG|`0`, meaning 1970-01-01|Casting TIMESTAMP to DATE rounds down the 
timestamp to the nearest day. Casts between string and date types assume 
standard SQL formatting&mdash;for example, `2000-01-02`. For handling other 
formats, use one of the [time 
functions](sql-scalar.md#date-and-time-functions).|
 |ARRAY|ARRAY|`NULL`|Druid native array types work as SQL arrays, and 
multi-value strings can be converted to arrays. See [Arrays](#arrays) for more 
information.|
-|OTHER|COMPLEX|none|May represent various Druid column types such as 
hyperUnique, approxHistogram, etc.|
+|OTHER|COMPLEX|none|May represent various Druid column types such as 
hyperUnique, cardinality, etc.|

Review Comment:
   `cardinality` isn't a type. This should mention that the available types 
depend on which extensions are loaded, and link to the extension docs. However, 
AFAIK none of the extension docs contain information on how the type is 
displayed in, for example, the `INFORMATION_SCHEMA` columns table, so those 
docs might also need to be updated to indicate how their types are presented.
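   
   As a rough sketch of what those docs could show, a query along these lines 
lists how each column's type is reported. The datasource name `my_datasource` 
is a placeholder, and the exact `DATA_TYPE` string for a given complex column 
depends on the Druid version and loaded extensions, so none is shown here:

```sql
-- List the reported type of every column in one datasource.
-- 'my_datasource' is a hypothetical name; substitute a real datasource.
SELECT "COLUMN_NAME", "DATA_TYPE"
FROM INFORMATION_SCHEMA.COLUMNS
WHERE "TABLE_SCHEMA" = 'druid'
  AND "TABLE_NAME" = 'my_datasource'
```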



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

