jihoonson commented on a change in pull request #10503:
URL: https://github.com/apache/druid/pull/10503#discussion_r506718820

##########
File path: docs/querying/caching.md
##########
@@ -82,3 +82,19 @@ Note that the task executor processes only support caches that keep their data l
 This restriction exists because the cache stores results at the level of intermediate partial segments generated by the ingestion tasks. These intermediate partial segments will not necessarily be identical across task replicas, so remote cache types such as `memcached` will be ignored by task executor processes.
+
+## Unsupported queries
+
+Query caching is not available for following:
+- Queries, that involve a `union` datasource, do not support result-level caching. Refer to the
+[related github issue](https://github.com/apache/druid/issues/8713) for details. Top level union SQL queries can still

Review comment:
```
../docs/querying/caching.md
   90 | [related github issue](https://github.com/apa
>> 1 spelling error found in 167 files
```
The CI is failing because of this line. Please add a suppression in `website/.spelling`. BTW, I think it should be `GitHub`.

##########
File path: docs/querying/caching.md
##########
@@ -82,3 +82,12 @@ Note that the task executor processes only support caches that keep their data l
 This restriction exists because the cache stores results at the level of intermediate partial segments generated by the ingestion tasks. These intermediate partial segments will not necessarily be identical across task replicas, so remote cache types such as `memcached` will be ignored by task executor processes.
+
+## Unsupported queries
+
+Query caching is not available for following
+- queries, that have a union operation, do not support result-level caching - [More details](https://github.com/apache/druid/issues/8713)

Review comment:
> I was deliberate in avoiding datasource term since SQL users don't define `datasource` as such. For them, its just union operator.

Even if they don't define a datasource themselves, their query will be translated into native queries, and the native query determines whether the result will be cached or not. I think it would be better to be precise so that users don't get confused.

> Though I think Top Level Union queries may still be cached since they are not translated into a Union datasource.

Good point, though I'm not sure what you mean by "Top Level Union queries". In SQL, the union operator can be translated by either `DruidUnionDataSourceRule` or `DruidUnionRule`. The former is converted to a `union` datasource, while the latter is executed sequentially by the SQL layer. AFAICT, the former can be used when it's a `UNION ALL` of flat scan subqueries. The latter can be used otherwise (still only for `UNION ALL`). So the result-level cache cannot be used for the former, but can be for the latter. Maybe it could say:

"Queries that have a `union` datasource do not support result-level caching. For SQL, a union SQL query can be translated to a native query with a `union` datasource when it is a `UNION ALL` of flat scan subqueries. These queries cannot be cached at the result level."
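
To make the distinction concrete, here is a minimal sketch (not Druid's actual code) of the rule as described in this comment: a native query whose top-level `dataSource` has type `union` is treated as not eligible for result-level caching. The helper names and the sample datasource names (`wikipedia_a`, `wikipedia_b`) are hypothetical, and the check deliberately ignores everything else that affects caching, such as query context flags or query type.

```python
from typing import Any, Dict


def has_union_datasource(native_query: Dict[str, Any]) -> bool:
    """Return True if the top-level dataSource is a `union` datasource.

    A table datasource may be written as a plain string, so only
    dict-shaped datasources can carry a "type" field.
    """
    data_source = native_query.get("dataSource")
    return isinstance(data_source, dict) and data_source.get("type") == "union"


def result_level_cacheable(native_query: Dict[str, Any]) -> bool:
    """Hypothetical helper: applies only the rule discussed in this review."""
    return not has_union_datasource(native_query)


# Shape of a native query that a `UNION ALL` of flat scans could translate to.
union_query = {
    "queryType": "scan",
    "dataSource": {"type": "union", "dataSources": ["wikipedia_a", "wikipedia_b"]},
    "intervals": ["2020-01-01/2020-02-01"],
}

# A query over a single table datasource.
table_query = {
    "queryType": "scan",
    "dataSource": "wikipedia_a",
    "intervals": ["2020-01-01/2020-02-01"],
}

print(result_level_cacheable(union_query))
print(result_level_cacheable(table_query))
```

The point for the docs is that the decision is made on the translated native query, not on the SQL text, which is why naming the `union` datasource explicitly seems less ambiguous than "union operation".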
