jihoonson commented on pull request #12250: URL: https://github.com/apache/druid/pull/12250#issuecomment-1048358326
> 1. Building the TaskConfig for tests. Might I suggest that we move to a builder pattern to make those objects for the tests. That would allow us to add new parameters without having to update all of the places in the tests where we just want the default. I agree that this is going to be useful. But, as you said, this is not going to help with this PR, but can rather make it worse due to increased PR size. I'd like to do it as a follow-up if you are OK. > 2. When using the IndexMergerV9. We started passing this in as a parameter to a method and that meant that we had to funnel all of the things through those methods. I think we can make the choice of whether to persist the null columns a field on `IndexMergerV9` instead of a parameter to the methods. The nicest way to do this is probably to add a boolean field to `IndexMergerV9` like `persistNullColumns` and then also have a `setPersistNullColumns` method to override it. We would call `setPersistNullColumns` from `TaskToolboxFactory`, which should be able to both check the context parameters _and_ look at `TaskConfig` > > 3. Constructing a new Appenderator. This gets all over the code because it is using a static method to do the construction instead of a `Factory`. It might be nicer to have an `AppenderatorFactory` that the `TaskToolbox` can return which could house various things as fields instead of needing parameters passed in. For this one in particular, the `TaskToolbox` has access to both the `TaskConfig` _and_ the original `Task` object, so we should be able to add a method to `TaskToolbox` to get this configuration value. `TaskToolbox` is already passed into the method that is used to construct the `Appenderator` so it should then just be able to be used to get the boolean value and now those call sites do not need to be changed. For 2) and 3), I added `IndexMergerV9Factory` and injected it in `TaskToolboxFactory`. `IndexMergerV9Factory` currently accepts one parameter, i.e., `storeEmptyColumns`, for creating `IndexMergerV9` but I imagine it can be extended to accept other parameters as well, such as `SegmentWriteOutMediumFactory`, that are configurable using task context. I didn't do it in this PR though as it will increase the PR size. `TaskToolboxFactory.build()` now contextualizes the `storeEmptyColumns` flag per the system property and the context of the given task. This way, the flag is fed directly from the task to `IndexMergerV9` instead of going through task -> appenderator -> indexMerger. This introduces a change that `IndexMergerV9` can be created per task instead of sharing a singleton instance. It's possible to create 2 instances with flag on and off and reuse them, but I don't see that that's going to be useful. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
