jihoonson commented on pull request #12250:
URL: https://github.com/apache/druid/pull/12250#issuecomment-1048358326


   >     1. Building the TaskConfig for tests.  Might I suggest that we move to 
a builder pattern to make those objects for the tests.  That would allow us to 
add new parameters without having to update all of the places in the tests 
where we just want the default.
   
   I agree that this is going to be useful. But, as you said, this is not going 
to help with this PR, but can rather make it worse due to increased PR size. 
I'd like to do it as a follow-up if you are OK.
   
   >     2. When using the IndexMergerV9.  We started passing this in as a 
parameter to a method and that meant that we had to funnel all of the things 
through those methods.  I think we can make the choice of whether to persist 
the null columns a field on `IndexMergerV9` instead of a parameter to the 
methods.  The nicest way to do this is probably to add a boolean field to 
`IndexMergerV9` like `persistNullColumns` and then also have a 
`setPersistNullColumns` method to override it.  We would call 
`setPersistNullColumns` from `TaskToolboxFactory`, which should be able to both 
check the context parameters _and_ look at `TaskConfig`
   >
   >     3. Constructing a new Appenderator.  This gets all over the code 
because it is using a static method to do the construction instead of a 
`Factory`.  It might be nicer to have an `AppenderatorFactory` that the 
`TaskToolbox` can return which could house various things as fields instead of 
needing parameters passed in.  For this one in particular, the `TaskToolbox` 
has access to both the `TaskConfig` _and_ the original `Task` object, so we 
should be able to add a method to `TaskToolbox` to get this configuration 
value.  `TaskToolbox` is already passed into the method that is used to 
construct the `Appenderator` so it should then just be able to be used to get 
the boolean value and now those call sites do not need to be changed.
   
   For 2) and 3), I added `IndexMergerV9Factory` and injected it in 
`TaskToolboxFactory`. `IndexMergerV9Factory` currently accepts one parameter, 
i.e., `storeEmptyColumns`, for creating `IndexMergerV9` but I imagine it can be 
extended to accept other parameters as well, such as 
`SegmentWriteOutMediumFactory`, that are configurable using task context. I 
didn't do it in this PR though as it will increase the PR size. 
`TaskToolboxFactory.build()` now contextualizes the `storeEmptyColumns` flag 
per the system property and the context of the given task. This way, the flag 
is fed directly from the task to `IndexMergerV9` instead of going through task 
-> appenderator -> indexMerger. This introduces a change that `IndexMergerV9` 
can be created per task instead of sharing a singleton instance. It's possible 
to create 2 instances with flag on and off and reuse them, but I don't see that 
that's going to be useful. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to