paul-rogers commented on issue #13816:
URL: https://github.com/apache/druid/issues/13816#issuecomment-1464364704

   Thanks, @vogievetsky, for the comment. The auto-detection of rollup was a 
response to someone who _didn't_ like the idea of a flag.
   
   As it turns out, the approach outlined here is not actually achievable. It 
will work better for the catalog to stay out of the "rollup-or-not" and 
"measure-or-dimension" business and simply state storage types. Rollup 
then becomes a property of the ingestion query, not the datasource. This allows 
a use case in which early data is detail and later data is rolled up.
   
   Also, it turns out that our aggregations are not quite ready for the level 
of metadata envisioned here. All we can really know is the storage type. Thus, 
a simple `long`, a `SUM(long)`, a `MIN(long)` and a `MAX(long)` are all the same 
at the physical level, so the catalog cannot tell them apart. Again, 
it is up to each query to choose an aggregate that works for that ingestion.
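   
   A minimal sketch of that point, in Python for brevity. Everything here is 
hypothetical (the `STORAGE_TYPE` table, `catalog_view` and `distinguishable` 
are illustrative names, not Druid APIs); it only models a catalog that records 
the storage type and nothing else.

```python
# Hypothetical model, not Druid code: each declared column form is mapped
# to the only thing a storage-type-only catalog would record.
STORAGE_TYPE = {
    "long": "LONG",
    "SUM(long)": "LONG",
    "MIN(long)": "LONG",
    "MAX(long)": "LONG",
}


def catalog_view(declared: str) -> str:
    """Return the storage type the catalog would record for a declaration."""
    return STORAGE_TYPE[declared]


def distinguishable(a: str, b: str) -> bool:
    """Two declarations can be told apart only if their storage types differ."""
    return catalog_view(a) != catalog_view(b)


if __name__ == "__main__":
    for declared in STORAGE_TYPE:
        print(declared, "->", catalog_view(declared))
    print(distinguishable("long", "SUM(long)"))
```

   Under this assumption all four declarations collapse to one storage type, 
so any choice of aggregate has to come from the ingestion query.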
   
   So, the revised proposal is that the user specifies the storage type 
as a native Druid type. Even there, it turns out that the Calcite planner only 
knows about finalized types, not intermediate types. The thinking is that 
Druid will eventually offer distinct functions for intermediate and final 
aggregators, but that is some time off.
   
   Or, the catalog could list the finalized type and validate the finalized 
aggregators against that type, even though MSQ will actually use some other 
type for intermediate aggregates.
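   
   A rough sketch of what that option could look like, in Python. The 
`Aggregator` class, the `validate` function, the `APPROX_COUNT` aggregator and 
the type strings are all assumptions made for illustration; none of them are 
taken from Druid or MSQ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Aggregator:
    """Hypothetical aggregator description, not a Druid class."""
    name: str
    intermediate_type: str  # type the engine might carry between stages
    finalized_type: str     # type the planner sees after finalization


def validate(catalog: Dict[str, str], column: str, agg: Aggregator) -> bool:
    """Check only the finalized type against the type listed in the catalog."""
    return catalog.get(column) == agg.finalized_type


# Illustrative catalog: column name -> finalized type.
catalog = {"clicks": "LONG"}

sum_agg = Aggregator("SUM", intermediate_type="LONG", finalized_type="LONG")
sketch_agg = Aggregator("APPROX_COUNT", intermediate_type="COMPLEX",
                        finalized_type="LONG")

if __name__ == "__main__":
    print(validate(catalog, "clicks", sum_agg))
    print(validate(catalog, "clicks", sketch_agg))
```

   In this sketch both aggregators pass validation, although the second one 
carries an intermediate type that the catalog never sees. That is the gap 
described above.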
   
   So, in the short term, perhaps the catalog will apply only to detail tables 
and not to rollup tables, because type information in the rollup case is not 
sufficient to allow any meaningful validation. Once the project leads sort out 
how MSQ aggregation will work, the catalog can implement whatever choices we make.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

