imply-cheddar edited a comment on pull request #12250: URL: https://github.com/apache/druid/pull/12250#issuecomment-1043846225
Okay, I think my comment about the runtime properties was offbase. TaskConfig *is* coming from runtime properties. I looked a lot deeper now and it seems like the extra files being touched is coming from 3 sources 1. Building the TaskConfig for tests. Might I suggest that we move to a builder pattern to make those objects for the tests. That would allow us to add new parameters without having to update all of the places in the tests where we just want the default. 2. When using the IndexMergerV9. We started passing this in as a parameter to a method and that meant that we had to funnel all of the things through those methods. I think we can make the choice of whether to persist the null columns a field on `IndexMergerV9` instead of a parameter to the methods. The nicest way to do this is probably to add a boolean field to `IndexMergerV9` like `persistNullColumns` and then also have a `setPersistNullColumns` method to override it. We would call `setPersistNullColumns` from `TaskToolboxFactory`, which should be able to both check the context parameters *and* look at `TaskConfig` 3. Constructing a new Appenderator. This gets all over the code because it is using a static method to do the construction instead of a `Factory`. It might be nicer to have an `AppenderatorFactory` that the `TaskToolbox` can return which could house various things as fields instead of needing parameters passed in. For this one in particular, the `TaskToolbox` has access to both the `TaskConfig` *and* the original `Task` object, so we should be able to add a method to `TaskToolbox` to get this configuration value. `TaskToolbox` is already passed into the method that is used to construct the `Appenderator` so it should then just be able to be used to get the boolean value and now those call sites do not need to be changed. So, if we do (1) above, it will make future changes better, but doesn't change the number of files touched for that reason. But if we do (2) and (3), I think we get a lot fewer files touched and much more centralized handling of the config and overrides, which should be beneficial for consistency. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
