gianm commented on issue #17891:
URL: https://github.com/apache/druid/issues/17891#issuecomment-2797391837
Like @clintropolis I also believe and hope that Druid 32 would do a better
job with `IN` filters generally. In addition to #16039 (native numeric `IN`
filter) there is also #16388 (speeds up planning). If there are still issues in
Druid 32 then we should keep working on it.
> I'm wondering if we can introduce 'Default Behaviour Change' label to PRs
> and add a dedicated section in the release note, or add such label to the
> description of related features/changes in the release note?
I do think it makes sense to add another label like "Upgrade Notes" for
something that should definitely appear in upgrade notes. Here is why: I dug
a little into what happened here. The original PR #14319 is tagged "Release
Notes" and has release note text from @clintropolis that says:
> A new broker configuration,
> druid.sql.planner.metadataColumnTypeMergePolicy adds configurable modes to how
> column types are computed for the SQL table schema when faced with differences
> between segments. A new leastRestrictive mode allows choosing the most
> appropriate type that data across all segments can best be coerced into, and is
> now the default behavior. This is a subtle behavior change around when segment
> driven schema migrations will take effect for the SQL schema. With
> latestInterval, the SQL schema will be updated as soon as the first job with
> the new schema has published segments in the latest time interval of the data,
> while using the new leastRestrictive mode, the schema will only be updated once
> all segments are reindexed to the new type. However, leastRestrictive is likely
> to have "better" query time behavior and eliminates some query time errors and
> other oddities that can occur when using latestInterval.
It does mention "new default" and "subtle behavior change", and also
justifies why the change was made ("eliminates some query time errors and other
oddities").
However, the actual release notes did not use this text. They say:
> You can now better control how Druid reacts to schema changes between
> segments. This can make querying more resilient when newer segments introduce
> different types, such as if a column previously contained LONG values and newer
> segments contain STRING.
>
> Use the new Broker configuration,
> druid.sql.planner.metadataColumnTypeMergePolicy to control how column types are
> computed for the SQL table schema when faced with differences between segments.
>
> Set it to one of the following:
>
> - leastRestrictive: the schema only updates once all segments are
>   reindexed to the new type.
> - latestInterval: the SQL schema gets updated as soon as the first job
>   with the new schema publishes segments in the latest time interval of the data.
>
> leastRestrictive can have better query time behavior and eliminates some
> query time errors that can occur when using latestInterval.
This text appears in the "Additional features and improvements" section and
doesn't mention that the default changed, i.e. that this is a behavior change.
So it seems that @clintropolis's text was rewritten and lost some important
information in the process.
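To make the behavior change concrete, here is a rough Python sketch of how the
two policies differ, based only on the descriptions quoted above. It is a toy
model, not Druid's implementation: the `Segment` class, the function names, and
the `PERMISSIVENESS` ordering are all hypothetical and exist only for
illustration.

```python
from dataclasses import dataclass

# Toy ordering from least to most permissive type. This ordering is an
# assumption made for illustration; it is not Druid's actual coercion logic.
PERMISSIVENESS = {"LONG": 0, "FLOAT": 1, "DOUBLE": 2, "STRING": 3}


@dataclass(frozen=True)
class Segment:
    interval_start: str  # ISO-8601 date, so string order matches time order
    column_type: str     # type of one column as stored in this segment


def latest_interval(segments):
    """Report the type found in the segment with the most recent interval."""
    return max(segments, key=lambda s: s.interval_start).column_type


def least_restrictive(segments):
    """Report the most permissive type seen across all segments."""
    return max((s.column_type for s in segments), key=PERMISSIVENESS.__getitem__)


# A column that was STRING, then a new ingestion job starts writing LONG.
old = [Segment("2024-01-01", "STRING"), Segment("2024-02-01", "STRING")]
mixed = old + [Segment("2024-03-01", "LONG")]
# Later, every segment is reindexed to the new type.
reindexed = [Segment(s.interval_start, "LONG") for s in mixed]

print(latest_interval(mixed))        # schema follows the newest segment
print(least_restrictive(mixed))      # schema stays at the type all data fits
print(least_restrictive(reindexed))  # schema changes only after full reindex
```

In this toy model the two policies disagree exactly while old and new segments
coexist, which is the window where someone upgrading would notice the new
default. According to the PR text quoted above, the earlier behavior
corresponds to setting druid.sql.planner.metadataColumnTypeMergePolicy to
latestInterval on the Broker.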
Typically when pulling together release notes, the release manager will look
at changes tagged "Release Notes" and "Incompatible". If we had another label
like "Upgrade Notes" for something that should definitely appear in upgrade
notes, that would guard against omissions like this. The problem with "Release
Notes" by itself is that it's too broad: it can be there for any reason,
including just being an important new feature.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]