adarshsanjeev commented on code in PR #18235:
URL: https://github.com/apache/druid/pull/18235#discussion_r2210146766
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/DataServerQueryHandlerUtils.java:
##########
@@ -48,17 +56,65 @@ private DataServerQueryHandlerUtils()
* Performs necessary transforms to a query destined for data servers. Does not update the list of segments; callers
* should do this themselves using {@link Queries#withSpecificSegments(Query, List)}.
*
- * @param query the query
- * @param dataSource datasource name
+ * @param query the query
+ * @param dataSourceName datasource name
*/
- public static <R, T extends Query<R>> Query<R> prepareQuery(final T query, final String dataSource)
+ public static <R, T extends Query<R>> Query<R> prepareQuery(
+ final T query,
+ final int inputNumber,
+ final String dataSourceName
+ )
{
// MSQ changes the datasource to an inputNumber datasource. This needs to be changed back for data servers
// to understand.
+ return query.withDataSource(transformDatasource(query.getDataSource(), inputNumber, dataSourceName));
+ }
- // BUG: This transformation is incorrect; see https://github.com/apache/druid/issues/18198. It loses decorations
- // such as join, unnest, etc.
- return query.withDataSource(new TableDataSource(dataSource));
+ /**
+ * Transforms {@link InputNumberDataSource} and {@link RestrictedInputNumberDataSource}, which are only understood
+ * by MSQ tasks, back into {@link TableDataSource} and {@link RestrictedDataSource} recursively.
+ */
+ static DataSource transformDatasource(
+ final DataSource dataSource,
+ final int inputNumber,
+ final String dataSourceName
+ )
Review Comment:
Yes, this is not the best way to handle it. However, the check still needs to
be present as a sanity check, since it stops the query from returning
incorrect results.
>identify the shape which should be rejected
This is harder to do accurately. Strictly speaking, the error only needs to be
thrown if the datasource being queried actually has realtime segments, and
that information is not available at compilation time. The alternative is to
fail any query that has a broadcast join or union on its datasources, but that
risks failing queries that would otherwise pass.
Since this is a Druid 34 blocker, I wanted to prevent incorrect results
without causing regressions by failing more eagerly than necessary. Is there a
way to fail it accurately at compile time?
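To make the trade-off concrete, here is a minimal sketch of what the eager,
shape-based alternative could look like. Everything in it is hypothetical:
`Source`, `Table`, `Join`, `Union`, `hasJoinOrUnion`, and `validateEagerly` are
simplified stand-ins invented for illustration, not Druid's actual classes or
the code in this PR, and `Join` stands in loosely for a broadcast join.

```java
import java.util.List;

final class EagerShapeCheckSketch
{
  // Hypothetical stand-in for a datasource tree; NOT Druid's DataSource interface.
  public interface Source
  {
    List<Source> children();
  }

  public record Table(String name) implements Source
  {
    public List<Source> children() { return List.of(); }
  }

  // Stands in for a (broadcast) join between two inputs.
  public record Join(Source left, Source right) implements Source
  {
    public List<Source> children() { return List.of(left, right); }
  }

  public record Union(List<Source> inputs) implements Source
  {
    public List<Source> children() { return inputs; }
  }

  /** Returns true if a join or union appears anywhere in the tree. */
  public static boolean hasJoinOrUnion(Source source)
  {
    if (source instanceof Join || source instanceof Union) {
      return true;
    }
    for (Source child : source.children()) {
      if (hasJoinOrUnion(child)) {
        return true;
      }
    }
    return false;
  }

  /**
   * Eager check: rejects on shape alone. It cannot know whether realtime
   * segments exist, so it may reject queries that would have run correctly.
   */
  public static void validateEagerly(Source source)
  {
    if (hasJoinOrUnion(source)) {
      throw new IllegalStateException("Joins and unions are rejected eagerly in this sketch");
    }
  }

  public static void main(String[] args)
  {
    validateEagerly(new Table("wiki")); // plain table: accepted
    try {
      validateEagerly(new Join(new Table("wiki"), new Table("lookup")));
    }
    catch (IllegalStateException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

The join in `main` is rejected even if neither table has realtime segments,
which is exactly the false-positive risk described above.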
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]