clintropolis commented on code in PR #15588:
URL: https://github.com/apache/druid/pull/15588#discussion_r1446752690
##########
web-console/src/druid-models/dimension-spec/dimension-spec.ts:
##########
@@ -64,6 +65,12 @@ export const DIMENSION_SPEC_FIELDS: Field<DimensionSpec>[] =
[
defaultValue: 'SORTED_ARRAY',
suggestions: ['SORTED_ARRAY', 'SORTED_SET', 'ARRAY'],
},
+ {
+ name: 'castToType',
+ type: 'string',
+ defined: typeIsKnown(KNOWN_TYPES, 'auto'),
+ suggestions: [undefined, 'ARRAY<STRING>', 'ARRAY<LONG>', 'ARRAY<DOUBLE>'],
Review Comment:
>When you say that ARRAY<FLOAT> works too but is just mapped to
ARRAY<DOUBLE> are you advocating that we add it to the suggestions?
Nah, I don't think it is necessary since they will be handled the same as
`ARRAY<DOUBLE>` when used by 'auto', was just mentioning all of the valid
strings (there are actually a few extras for backwards compatibility
https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/column/Types.java#L37)
>Remember that a standard dimension spec type can be flooooot (it gets
mapped to string) but we don't want to suggest that.
Yea, that is true, though this is a bit different because `castToType` only
supports real types else if it cant parse them it falls back to full 'auto'
behavior (though I suppose this also could have been an error to not repeat the
silent different behavior of the dimension schema types themselves).
>Why would we not use 'auto' with 'castToType' everywhere 🤔 ? I really like
it because it let's us get away from dimension types (which have a confusing
relationship with column types).
I think eventually we do want to use them everywhere, I just wasn't too sure
if now is the right time quite yet. There are a few reasons you might not want
to use it today. First is of course if you want MVDs, since the STRING column
produced by 'auto' will never make an MVD. The second would be I think if you
want numeric columns but do not want the indexes that all 'auto' numeric
columns have today, since this can make the segments larger, though other than
disk space there probably isn't much of a penalty for this if not actually
filtering the numbers, leaving the main cost just longer ingest time to build
the indexes. I also plan to eventually allow 'auto' and nested columns to be
customized to allow things like leaving out the indexes similar to what we can
do today with string columns (and eventually, specifying alternative indexes
once I implement them). Given that the second thing isn't much of an issue, I
don't really think it would be a problem to switch native to alway
s using 'auto' with 'castToType' for all cases except for MVDs.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]