paul-rogers commented on a change in pull request #11549:
URL: https://github.com/apache/druid/pull/11549#discussion_r682982767
##########
File path: docs/querying/segmentmetadataquery.md
##########
@@ -144,16 +144,16 @@ Types of column analyses are described below:
### cardinality
-* `cardinality` in the result will return the size of the bitmap index or dictionary encoding for string dimensions, or null for other dimension types.
- If `merge` was set, the result will be the max of this value across segments. Only relevant for dimension columns.
+* `cardinality` in the result will return the number of unique values present in a string column. It is null for other column types.
+ If `merge` is set, the result will be the max of this value across segments. Only relevant for string columns.
Review comment:
This is not clear to us newbies. Does "max" mean the largest cardinality
found in any single segment, or the aggregated total across all segments?
Both are useful, but they say different things. If I have 1M rows and see a
cardinality of 1K, that could mean either A) 1K unique values in total, or
B) up to 1K unique values per segment. Even under B, the meaning depends on
segment size: with 100K rows per segment, 1K per segment says one thing;
with 1K rows per segment, a cardinality of 1K per segment says something else.
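To make the two readings concrete, here is a small sketch. It is not Druid
code: the segment contents and both function names are made up for
illustration, and it assumes each segment can be modeled as the set of
distinct values of one string column.

```python
# Hypothetical distinct values of one string column, one set per segment.
segments = [
    {"a", "b", "c"},
    {"c", "d"},
    {"e", "f", "g", "h"},
]

def total_cardinality(segs):
    """Reading A: distinct values across all segments combined."""
    return len(set().union(*segs))

def max_per_segment_cardinality(segs):
    """Reading B: the largest distinct count seen in any single segment."""
    return max((len(s) for s in segs), default=0)

overall = total_cardinality(segments)
per_segment_max = max_per_segment_cardinality(segments)
print(overall, per_segment_max)
```

With this data the two readings give different numbers, which is why the doc
wording should say which one "max" refers to.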
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]