clintropolis opened a new pull request #11818:
URL: https://github.com/apache/druid/pull/11818


   ### Description
   
   Currently, the `PostAggregator` interface defines a method `getType` which 
returns a `ColumnType` for use when populating a `RowSignature`. Many 
`PostAggregator` implementations have fixed return types, and are so 
independent of the underlying `RowSignature`, but some, such as 
`FieldAccessPostAggregator` and `ExpressionAggregator`, are sensitive to their 
inputs.
   
   Currently these dynamic return type `PostAggregator` implementations are 
unable to produce an output type until they have been `decorated` which 
provides access to a map of column names to `AggregatorFactory`, which provide 
their own type information. The only place using the output type of 
`PostAggregators` is `RowSignature`, so if we modify the `getType` method to 
accept a `ColumnInspector`, we can compute the output prior to decoration, if 
the `RowSignature` provides itself as the `ColumnInspector`.
   
   The `ExpressionPostAggregator` was missing an output type completely, 
despite the fact that Druid expression system for some time now has been able 
to infer the output type of an expression if the inputs to all of its free 
variables are known, so it has been updated to support producing the output 
type from either the inspector, _or_ from the `decorate` method, if it has been 
called.
   
   I also updated `FieldAccessPostAggregator` to be able to compute it's output 
type from the `ColumnInspector` if decorate has not been called, but did not 
yet do `FinalizingFieldAccessPostAggregator`, as I think we would need a 
clearer split between intermediary and finalized `RowSignature` that is not yet 
in place.
   
   I left the 'decoration computes type' pattern in place, and the code prefers 
those over the `ColumnInspector` computed types because I wasn't completely 
sure if there would be any consequences to this, but it might very be possible 
to remove them from these aggregators in the future if it isn't needed.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `PostAggregator`
    * `ExpressionPostAggregator`
    * `FieldAccessPostAggregator`
   
   <hr>
   
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to