clintropolis opened a new pull request, #14351:
URL: https://github.com/apache/druid/pull/14351

   ### Description
   I think we should consider switching the `IndexSpec` default value of 
`stringDictionaryEncoding` to `{"type":"frontCoded", "bucketSize":4, 
"formatVersion":1}`.
   
   Based on the measurements in #13854, things look pretty good: we have been 
running version 0 of the format on a number of datasources for some time 
without any notable performance loss, and version 1 for a shorter period. I 
think by the time 27 is released it should be sufficiently baked in for us to 
feel confident about it being the default.
   
   However, this means that upgrading from versions older than 26 will need 
special consideration, so it is important to call out in the release notes if 
we go forward with this.
   
   #### Release note
   Front coding was originally introduced in Druid 25.0, and an improved 
'version 1' format, with typically faster read speed and smaller storage size, 
was introduced in Druid 26.0. Version 1 has become the default in Druid 27.0. 
This means that, by default, segments created with Druid 27.0 are backwards 
compatible with Druid 26.0, but not with Druid versions older than 26.0. If 
upgrading to Druid 27.0 from a version older than 26.0, 
`stringDictionaryEncoding` should be set to `{"type": "utf8"}` to keep writing 
the older format, which enables seamless downgrades to Druid 25.0 and older. 
Once it is determined that rollback is not necessary, changing it to the new 
default is recommended.
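   
   As a rough illustration of the override described above, the fragment below 
sketches where the setting might go, assuming the `indexSpec` is supplied 
through the `tuningConfig` of a native batch ingestion spec. The 
`index_parallel` task type is only an example, and the rest of the spec is 
omitted:
   
   ```json
   {
     "tuningConfig": {
       "type": "index_parallel",
       "indexSpec": {
         "stringDictionaryEncoding": {
           "type": "utf8"
         }
       }
     }
   }
   ```
   
   Once rollback is no longer a concern, removing the override (or setting 
`stringDictionaryEncoding` explicitly to `{"type":"frontCoded", 
"bucketSize":4, "formatVersion":1}`) should switch back to the new default.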
   
   
   <hr>
   
   This PR has:
   
   - [x] been self-reviewed.
   - [x] added documentation for new or modified features or behaviors.
   - [x] a release note entry in the PR description.
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for 
[code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) 
is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
