clintropolis commented on pull request #12291: URL: https://github.com/apache/druid/pull/12291#issuecomment-1062142843

> * string dimension, no dictionary <-- i.e., what we get from an expression that isn't backed by a single dictionary-coded string column

This one uses `scanAndAggregateWithCardinalityUnknown`, because columns which are `1:*` or `*:*` should not report themselves as dictionary encoded or as having name lookup possible in advance.

> * string dimension, dictionary coded, unique (1-1 mapping from keys -> values) <-- i.e., what we get from a regular column in a segment, or a dictionary-coded string column plus an ExtractionFn that is ONE_TO_ONE
> * string dimension, dictionary coded, nonunique <-- i.e., what we get from an expression backed by a single dictionary-coded string column, or a dictionary-coded string column plus an ExtractionFn that is MANY_TO_ONE, or an IndexedTable

These two cases use `scanAndAggregateWithCardinalityKnown`. Prior to this patch the latter case used `scanAndAggregateWithCardinalityUnknown`, which is better for `IndexedTable` (its dictionary ids never repeat, so it always has to perform the `lookupName` value lookup and the hash table lookup anyway), but far worse for expressions and lookups, which currently never report their dictionaries as unique and potentially have many repeated dictionary ids.

Should I do something to clear up the javadocs?
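
To illustrate the trade-off, here is a minimal toy sketch (not the actual Druid code) of why repeated dictionary ids favor the cardinality-known path. `CardinalitySketch`, `aggregateKnown`, `aggregateUnknown`, and the `long[]` counters standing in for aggregators are all hypothetical names made up for this example; only the general idea of caching per dictionary id versus doing a value lookup per row comes from the discussion above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Toy model only: class and method names are hypothetical, not Druid's API.
final class CardinalitySketch
{
  // Cardinality known: cache the counter per dictionary id, so the value lookup
  // and the hash table lookup happen only the first time an id is seen.
  static Map<String, long[]> aggregateKnown(int[] rowIds, int cardinality, IntFunction<String> lookupName)
  {
    long[][] counterById = new long[cardinality][];
    Map<String, long[]> counters = new HashMap<>();
    for (int id : rowIds) {
      long[] counter = counterById[id];
      if (counter == null) {
        // A nonunique dictionary may map several ids to one value; going through
        // the map makes those ids share the same counter.
        counter = counters.computeIfAbsent(lookupName.apply(id), k -> new long[1]);
        counterById[id] = counter;
      }
      counter[0]++;
    }
    return counters;
  }

  // Cardinality unknown: every row pays for a value lookup and a hash table lookup.
  static Map<String, long[]> aggregateUnknown(int[] rowIds, IntFunction<String> lookupName)
  {
    Map<String, long[]> counters = new HashMap<>();
    for (int id : rowIds) {
      counters.computeIfAbsent(lookupName.apply(id), k -> new long[1])[0]++;
    }
    return counters;
  }

  public static void main(String[] args)
  {
    String[] dictionary = {"a", "b", "a"}; // nonunique: ids 0 and 2 map to the same value
    int[] rowIds = {0, 1, 2, 0, 0, 1};
    int[] knownLookups = {0};
    int[] unknownLookups = {0};
    aggregateKnown(rowIds, dictionary.length, id -> { knownLookups[0]++; return dictionary[id]; });
    aggregateUnknown(rowIds, id -> { unknownLookups[0]++; return dictionary[id]; });
    System.out.println("lookups with cardinality known: " + knownLookups[0]);
    System.out.println("lookups with cardinality unknown: " + unknownLookups[0]);
  }
}
```

In this toy model both paths produce the same totals, but the known path performs one value lookup per distinct dictionary id while the unknown path performs one per row. When ids never repeat (the `IndexedTable` situation described above) the per-id cache saves nothing.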
