amrishlal commented on a change in pull request #6942:
URL: https://github.com/apache/incubator-pinot/pull/6942#discussion_r635595775
##########
File path:
pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
##########
@@ -1682,6 +1685,74 @@ static void validateRequest(PinotQuery pinotQuery, int
queryResponseLimit) {
if (!queryOptions.isGroupByModeSQL() ||
!queryOptions.isResponseFormatSQL()) {
throw new IllegalStateException("SQL query should always have response
format and group-by mode set to SQL");
}
+
+ // Validate column names from query
+ Set<String> columnsFromQuery = getColumnsFromQuery(pinotQuery);
+ if (!columnNamesFromSchema.containsAll(columnsFromQuery)) {
+ brokerMetrics.addMeteredTableValue(rawTableName,
BrokerMeter.INVALID_COLUMN_NAMES_IN_QUERY, 1L);
+ }
+ }
+
+ /**
+ * Fetch column names from the SQL query.
+ * @param pinotQuery pinot query
+ */
+ private static Set<String> getColumnsFromQuery(PinotQuery pinotQuery) {
Review comment:
Most queries should have valid column names, so from what I understand
this check is only useful for those few queries that won't have invalid column
names, yet to detect those few cases we seem to be putting the cost on all
queries.
I don't mean to say that we shouldn't do this check (semantic checking of
the query is useful), but it would be good do it in a way where multiple checks
(invalid column, invalid table name, a string column name being passed to a
function that takes numerical columns - avg for example, etc) are handled
within one or two recursive decent passes over the query tree . Minor changes
to the query tree could also be made such as what is being done in
`updateColumnNames` function which seems to be traversing the query tree very
similar to this code. `updateColumnNames` and this function could be combined
to reduce the number of passes.
Also `DataSchemaSegmentPruner` (on server side) is already checking if a
column name is present in segment. I am wondering if incrementing the metric
there would work?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]