paul-rogers commented on issue #13816: URL: https://github.com/apache/druid/issues/13816#issuecomment-1468766668
Discussion has suggested that our goal is to add new agg functions that provide intermediate values, followed by a set of finalize functions that convert intermediate values to finalized values. This is an interesting approach. But it is not currently in the code, and so is not something that catalog support can build upon today. Given the future direction, we should wait for that work to be completed.

Previously, the notion was to declare the current SQL agg function return types differently in the rollup and non-rollup use cases. For rollup, agg functions would be declared to return intermediate types. For non-rollup, the agg functions would be declared to return the finalized types. Since the Calcite types are just a fantasy, we simply adjusted the fantasy to make it easier to validate metric types. That short-term fix, however, is not consistent with the longer-term direction.

The result is that we have to work, short term, with finalized types. We can use finalized types in the catalog as proxies for the rollup types. For example, for `LATEST_BY(x, t)`, use the type of `x`, not the actual type of `COMPLEX<PAIR<s, LONG>>` where `s` is the type of `x`.

This means that we can't actually tell MSQ the type to use, since MSQ must be free to use the intermediate type even if the Calcite layer thinks the query uses the finalized type. This may mean we can't enforce other types either, since the code may not know which column is a metric and which is a dimension.

As a result, types for rollup tables in the catalog are mostly for documentation: MSQ won't be able to enforce them because of the ambiguity around aggregate rollup types. This is probably OK: Druid works fine with whatever column types MSQ produces. Moving forward, there are some exciting new schemaless features coming online that, by definition, can't be constrained by catalog definitions. After we implement the new set of rollup functions, we can sort out what exactly we want to enforce in MSQ.

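
As a rough illustration of the proxy idea for `LATEST_BY(x, t)`, here is a minimal Python sketch. The names `AggTypes`, `latest_by_types`, and `catalog_type` are hypothetical and are not part of Druid; the type strings simply follow the `COMPLEX<PAIR<s, LONG>>` notation used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggTypes:
    """Hypothetical pair of types associated with one aggregate call."""
    intermediate: str  # what a rollup datasource would actually store
    finalized: str     # what the Calcite layer believes the call returns


def latest_by_types(value_type: str) -> AggTypes:
    """Types for LATEST_BY(x, t), where value_type is the type of x."""
    return AggTypes(
        intermediate=f"COMPLEX<PAIR<{value_type}, LONG>>",
        finalized=value_type,
    )


def catalog_type(agg: AggTypes) -> str:
    """The catalog records the finalized type as a proxy for the rollup type."""
    return agg.finalized


types = latest_by_types("STRING")
print(catalog_type(types))   # type recorded in the catalog
print(types.intermediate)    # type MSQ remains free to write
```

Because the catalog entry and the stored type differ for such columns, a check that compares the two directly would reject valid rollup ingestion, which is why the catalog type is treated as documentation here.
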
For now, let's get the basic metadata functionality in place so that we can have the broader discussion.
